Repeat combination soon in your neighborhood

GillesD

Member
Up to now, 6 Canadian lotteries (Lotto 6/49 with Atlantic 49, Quebec 49, Ontario 49, Western 49 and BC 49) total up 8,640 draws, a small 0.062% of all possibilities. So most would not expect a combination to appear in any two of those lotteries, the same argument many use to say that the same combination will not come out in the near future in a lottery.

But the facts show otherwise. Based on those 8,640 combinations, we find that three (3) combinations have already shown up twice; they are:

A) Numbers 03 – 06 – 37 – 38 – 45 – 47 in both draw #2046 of Lotto 6/49 and draw #853 of Quebec 49;

B) Numbers 06 – 16 – 21 – 29 – 30 – 44 in both draw #821 of BC 49 and draw #213 of Atlantic 49;

C) Numbers 08 – 19 – 28 – 32 – 36 – 45 in both draw #277 of Quebec 49 and draw #314 of Western 49.

So it is quite possible that quite soon, we will see the same combination come out again in a lottery somewhere in this small world.
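A rough check of how surprising three repeats are can be sketched in Python. This is only a sketch: it assumes every draw is independent and uniformly random, and that all six lotteries pick from the same 13,983,816 combinations. The function name is made up for illustration.

```python
from math import comb

def expected_repeat_pairs(draws: int, combos: int) -> float:
    """Expected number of pairs of draws that produced the same combination."""
    # Each of the C(draws, 2) pairs matches with probability 1 / combos.
    return comb(draws, 2) / combos

COMBOS_649 = comb(49, 6)  # 13,983,816 possible 6/49 combinations
pairs = expected_repeat_pairs(8640, COMBOS_649)
print(f"expected repeated pairs in 8,640 draws: {pairs:.2f}")
```

Under those assumptions the expected count comes out between two and three, so finding three repeated combinations is close to what chance alone would give.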
 

johnph77

Member
Statistically, that's an improbability. With 13,983,816 possibilities, it's only after about 39.45% of draws - in this case about 5,516,615 draws - does the possibility of even one duplicate in all previous draws become a break-even probability. Bears watching.
 

GillesD

Member
Improbability or bad way of looking at it?

johnph77 said:
Statistically, that's an improbability. With 13,983,816 possibilities, it's only after about 39.45% of draws - in this case about 5,516,615 draws - does the possibility of even one duplicate in all previous draws become a break-even probability. Bears watching.

It is true that it appears rather improbable and I was surprised by the results. But this reminded me of two facts:

A - I once ran a quick test to evaluate how long it took to get a repeat number in a 6/49 lottery. In some cases, I got duplicate combinations before 10,000 tries and in most cases before 25,000. The method used was not very sophisticated; I used RANDBETWEEN(1;13983816) to generate random numbers and checked whether each number had already come out.

B - It is known that if you randomly take 30 or 31 persons (I don't remember the exact value), then you have more than 50% of the chances to get two persons with the same birthday (day and month), even if you are picking 30 values out of a possibility of 365.

I will look more closely at these two facts. But this is random (very random), I also added the results from UK 49 lottery (1400 draws) and did not get any more duplicate combinations.
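The RANDBETWEEN test described in point A could be sketched in Python roughly as follows. It is not the original spreadsheet; the function name, the seed and the number of trials (200) are arbitrary choices for illustration.

```python
import random
from math import comb

def draws_until_repeat(combos: int, rng: random.Random) -> int:
    """Draw random combination indexes until one comes out a second time."""
    seen = set()
    while True:
        index = rng.randint(1, combos)  # same role as RANDBETWEEN(1;13983816)
        if index in seen:
            return len(seen) + 1        # count the draw that caused the repeat
        seen.add(index)

rng = random.Random(649)                # fixed seed so the run can be repeated
results = sorted(draws_until_repeat(comb(49, 6), rng) for _ in range(200))
print("shortest run:", results[0])
print("median run:  ", results[len(results) // 2])
print("longest run: ", results[-1])
```

Each trial stops at the first repeat, so the list gives a feel for how widely the waiting time varies from one simulated lottery to the next.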
 

johnph77

Member
The 39.45% probability factor seems to be a hard and fast number no matter what lottery is run. A while back I did exactly that with the 6/49 matrix in a spreadsheet, then did the MegaMillions 5/56 and Powerball 5/55 matrices as well. It was even true for the Pick 4 so I feel fairly confident that that figure is pretty much a hard and fast number.

I just ran the birthday thingy in a spreadsheet and it turned over 50% at the 144 person mark - again, at the 39.45% mark. However, I've heard the same thing you did, albeit at about the 73-person mark - but that was eons ago and I might not have remembered that number correctly.
 

GillesD

Member
Birthday problem

It appears that the same problem (How many people are required to get at least a 50% chance of having the same birthday?) has multiple answers.

You mention 144 persons are needed (the same 39.45% already quoted) but follow this with a 73 persons possibility (more than a 50% reduction).

But this is still quite a bit higher than the 30-31 persons I had mentioned in my post. But I was wrong with this value (my memory is starting to play tricks on me). I did a search on Yahoo and Google using "birthday problem" as the search value. Many sites were listed (from Wikipedia to some highly statistical ones) and generally the answer to this problem is 23 persons, with in most cases an explanation that appears to make sense.

You will probably understand better than me the probabilities involved and I will leave it to you to explain if this really applies to the problem of one combination of a 6/49 lottery repeating itself fairly quickly.
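For what it is worth, the 23-person answer can be checked with a few lines of Python. This is a minimal sketch that assumes every birthday (or every combination) is equally likely; the function name is invented for the example.

```python
def draws_for_even_odds(possibilities: int) -> int:
    """Smallest number of draws giving at least a 50% chance of a repeat."""
    p_no_repeat = 1.0
    draws = 0
    while p_no_repeat > 0.5:
        # The next draw must avoid the `draws` values already seen.
        p_no_repeat *= (possibilities - draws) / possibilities
        draws += 1
    return draws

print(draws_for_even_odds(365))         # 23
print(draws_for_even_odds(13_983_816))  # 4404
```

The same loop works for any number of equally likely outcomes, which is why the birthday reasoning carries over to a 6/49 lottery.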
 

johnph77

Member
A specific combination? No. Any combination occurring in x number of previous draws? Yes.
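The difference between the two questions can be put in numbers with a short Python sketch. It assumes independent, uniformly random draws; both function names are made up for illustration.

```python
from math import comb

COMBOS = comb(49, 6)  # 13,983,816

def chance_specific(draws: int, combos: int = COMBOS) -> float:
    """Chance that one named combination comes out at least once."""
    return 1 - (1 - 1 / combos) ** draws

def chance_any_repeat(draws: int, combos: int = COMBOS) -> float:
    """Chance that at least two of the draws share a combination."""
    p_all_different = 1.0
    for already_drawn in range(draws):
        p_all_different *= (combos - already_drawn) / combos
    return 1 - p_all_different

print(f"specific combination in 8,640 draws: {chance_specific(8640):.4%}")
print(f"any repeat among 8,640 draws:        {chance_any_repeat(8640):.1%}")
```

The first figure stays tiny while the second is large, because the second question counts every possible pair of draws rather than a single target.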

Thanks for the mental stimulation - I needed that.
 

CMF

Member
A test database I have used for many years is made up of the winning numbers for 6/49 Lotto games from around the world. The date order is retained and is useful for working out various scenarios like how long for a run of even or odd indexes if playing half the possibilities. Search Lotto in Google Groups or here is the link if the webmaster allows it http://groups.google.com/group/lottogroup/web/understanding-the-odds-in-lotto

You will find that in some 20,000 draws at least 20 are repeats.

Regards
Colin Fairbrother

nb To the Webmaster
To revive this site at least a bit I think you should allow some links that are Lotto related, have some relevant content and are not just blatant advertisements. After all this is the internet and cross referencing is good for increasing your ratings.
 

GillesD

Member
Database for duplicate matches

CMF
Is the database you mention for duplicate combinations available somewhere? I may want to look at it and maybe use it in some calculations. Is the source of the winning combinations indicated?

By the way, the value you quoted (260,624 combinations) for matching 3 out of 6 numbers and 3 out of the remaining 43, is not fully exact. The exact value (when stated as such) is 246,820.

But the 260,624 combinations is right though, if to the match 3, you add the match 4 (with 13,545 combinations), the match 5 (with 258 combinations) and the match 6 (with 1 combination).

Also you do not have to go through all combinations to get the answer. The formula =COMBIN(6;3)*COMBIN(43;3) will give you 246,820. To get to 260,624 combinations, just add + COMBIN(6;4)*COMBIN(43;2) + COMBIN(6;5)*COMBIN(43;1) + COMBIN(6;6)*COMBIN(43;0) to the preceding formula.
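The same COMBIN arithmetic can be reproduced outside Excel. Here is a small Python sketch using `math.comb`; the helper name is invented for the example.

```python
from math import comb

def lines_matching_exactly(k: int, pool: int = 49, pick: int = 6) -> int:
    """Number of lines sharing exactly k numbers with one given line."""
    # Choose k of the line's own numbers and pick - k of the other numbers.
    return comb(pick, k) * comb(pool - pick, pick - k)

match3 = lines_matching_exactly(3)
match3_or_more = sum(lines_matching_exactly(k) for k in range(3, 7))
print(match3)          # 246820
print(match3_or_more)  # 260624
```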
 

CMF

Member
GillesD

You can build the database through a Lotto program advertised on this site as the trial version includes Lotto histories from around the world. Warning: You will find duplicate entries that are incorrect which I pointed out to John Lake years ago and which were not corrected. Most are easy to pick up as they are consecutive - not impossible but astronomically unlikely. The histories are stored in dBase which Excel can handle. From there on you have to do a bit of work yourself. Alternatively you just go to the Lotto operator sites and download the databases, which quite often are in Excel format.

The value I gave on a different topic of Coverage Calculation speed for processing 1 line in a Pool 49, Pick 6 Lotto game of 260,624 is correct. How you attribute the reference you made from the following beats me:
"The basic test is to calculate the coverage of say 44 45 46 47 48 49 by iterating through all the 13,983,816 possibilities and counting a match Three. ..."
The iteration process may involve bypassing but all are considered.

I suggest you read the full article - there is some very interesting code there especially at the end. After reading you should realize there is more to it when calculating the coverage of say, 163 lines in less than 1 second than applying a simple formula.

Don't always assume that others know less than you. If you checked out some of the people that contributed you would find more than one is an expert in this area.

Regards
Colin Fairbrother
 

GillesD

Member
Clarification

CMF

Thanks for the information, but from what you posted, I do not think I will look further into the database with winning numbers of various 6/49 lotteries. With some errors requiring patching, it is not worth the trouble. I already have over 10,000 results (from Lotto 6/49, Atlantic 49, Quebec 49, Ontario 49, Western 49 and BC 49 all in Canada and UK 49) and I have full confidence in the data, most of it verified at some point with lottery commissions or cross-checking with various sites.

When working with numbers, either on a professional basis or playing with lottery numbers, I try never to assume. This is a no-win situation. I will check everything as much as I can and I try to confirm data from at least two sources wherever possible.

So when I read the text "… iterating through all the 13,983,816 possibilities and counting a match Three …", I never assumed that meant also to count match Four, match Five and match Six. I used a formula based approach using the COMBIN function, since it was faster for that purpose. Yes, I could have looped through all combinations but then, I would have added some code to discard combinations with less than three matches or with more than 3 matches. And the results would have been the same: 246,820 combinations. My approach has also another advantage: it can be used to calculate as rapidly the values for match Zero (6,096,454 combinations), match One (5,775,588 combinations), match Two (1,851,150 combinations), match Four (13,545 combinations), match Five (258 combinations) and match Six (1 combination).
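To illustrate that looping and the COMBIN approach agree, here is a hedged Python sketch. To keep the run short it loops over a smaller pool (20 numbers, pick 6), which is purely an assumption for the demo and not one of the games discussed here; the last line applies the same formula to 6/49.

```python
from itertools import combinations
from math import comb

def count_by_matches(ticket, pool, pick):
    """Loop over every line and tally how many numbers it shares with ticket."""
    counts = [0] * (pick + 1)
    chosen = set(ticket)
    for line in combinations(range(1, pool + 1), pick):
        counts[len(chosen.intersection(line))] += 1
    return counts

POOL, PICK = 20, 6                      # small demo pool, not a real game
ticket = (15, 16, 17, 18, 19, 20)
looped = count_by_matches(ticket, POOL, PICK)
formula = [comb(PICK, k) * comb(POOL - PICK, PICK - k) for k in range(PICK + 1)]
print(looped == formula)                # True
print(sum(formula) == comb(POOL, PICK)) # True

# Same formula for 6/49, match Zero through match Six
print([comb(6, k) * comb(43, 6 - k) for k in range(7)])
```

The loop is the more flexible tool once several lines have to be checked at the same time, while the formula is enough for a single line.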

My method provided me the answer I was looking for: it required typing only 25 characters without the hassle of programming. But I agree with you: if I had looked for something else or more, then most likely my approach would have been different.

I also agree that quite a few do know more than me, but when somebody, expert or not, posts here, he can expect that someone will check his data and post back if the data is wrong (not the case with you) or lends itself to misinterpretation (match Three including match Four to Six is a good example) or else ... This is when my quality background comes into play and I try to clarify the situation.
 

CMF

Member
GillesD

It's a good exercise for an Excel user and a better stimulant to the brain than a cup of tea - green of course. Just in case I was asked a question or two I went through the motions of doing it in Excel.

You need to use msqry32.exe which comes with Excel and is usually located in Program Files/Microsoft Office/Office. I suggest putting a shortcut on your desktop and having a good play with it. If you are going to use the .dbf files from John Lake's program then first create a data source (uncheck "Use Current Directory" after clicking on connect) then create a .dbf table definition calling it say, AllWorld using say, Cn649.dbf. Then all you have to do is execute an SQL to append the various databases, which is simply for the German draws -
INSERT INTO AllWorld ( DRAWDATE, P1, P2, P3, P4, P5, P6, P7 )
SELECT Ge649.DRAWDATE, Ge649.P1, Ge649.P2, Ge649.P3, Ge649.P4, Ge649.P5, Ge649.P6, Ge649.P7
FROM Ge649;
Naturally, you need to change Ge649 to whatever database you are appending. The procedure is similar if you are using text files downloaded from the internet websites.
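For anyone without MS Query, the same append step can be tried in Python with the built-in sqlite3 module. This is only a sketch: the table and column names follow the SQL above, the two sample rows are made up and are not real results, and the duplicate search assumes P1 to P6 are stored in ascending order in every row.

```python
import sqlite3

COLUMNS = "DRAWDATE, P1, P2, P3, P4, P5, P6, P7"

conn = sqlite3.connect(":memory:")
for table in ("AllWorld", "Ge649"):
    conn.execute(f"CREATE TABLE {table} ({COLUMNS})")

# Made-up rows for illustration only, not real draw results
conn.execute("INSERT INTO AllWorld VALUES ('2000-01-01', 1, 2, 3, 4, 5, 6, 7)")
conn.executemany(
    "INSERT INTO Ge649 VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [("2000-02-02", 1, 2, 3, 4, 5, 6, 8),
     ("2000-02-09", 10, 11, 12, 13, 14, 15, 16)],
)

# Same append query as in the post
conn.execute(f"INSERT INTO AllWorld ({COLUMNS}) SELECT {COLUMNS} FROM Ge649")

# List every six-number combination that appears more than once
repeats = conn.execute(
    """SELECT P1, P2, P3, P4, P5, P6, COUNT(*) AS times
       FROM AllWorld
       GROUP BY P1, P2, P3, P4, P5, P6
       HAVING COUNT(*) > 1"""
).fetchall()
print(repeats)  # [(1, 2, 3, 4, 5, 6, 2)]
```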

I suggest you try running the last solution for getting the Coverage of 163 lines in VBA using Excel - it's amazingly fast.

I fall into line with using conventional notation for coverage etc in this area of interest which means there should not be a problem with interpretation.

Regards
Colin Fairbrother
 

CMF

Member
GillesD

Looking at your first post in this thread again, in effect what you are saying is that the three combinations that have appeared twice in some 8,500 Canadian draws will soon appear 3 times in the some 20,000 All World 6/49 games. No combination has appeared 3 times yet in the All World 6/49 games, so the likelihood that the specific 3 combinations that have appeared twice in the Canadian 6/49 games will be the first to do so is highly improbable. Yes I know, this just confirms what Johnph77 stated! In other words you may need wings before you ever see it happen.

Regards
Colin Fairbrother

ps Did you know that 60% of Americans believe angels exist and have wings? Now, you can maybe understand the strange preoccupation of Americans in making movies where someone has a flutter in "heaven" and then comes back to earth as their mother, father, sister, brother whatever.
 

serge

Member
Hello,

I'd like to know if someone has or can create this very useful Excel file of 5/56 combinations by number position, with the lexicographic number in front of each combination. There is a total of 260 numbers by position.

It would look like this if someone were to do it.

First of all, all columns would be at width : 15.

Details for the file display.

For position 1.

1. Cell A1 and B1 would be " merge " together with in it : Position 1.

2. Row 2 would be empty.

3. Cell A3 and B3 would also be " merge " with the number 1 in it.

4. Row 4 would be empty.

5. Cell A5 would have the word : Lexicographic in it.

6. Cell B5 would have the word : Combinations in it.

7. Row 6 would be empty.

8. Cell A7 and up would have the lexicographic numbers list from the combinations of the column B.

9. Cell B7 and up ( column ) would have all the combinations for the number 1 " as position1 ".



It should look like this :



For all the numbers as " position 1 ". (1 to 52)

It will start at columns : A,B with number 1 and will stop at columns : CY,CZ with number : 52.

For all the numbers as " position 2 ". (2 to 53)

It will start at columns : DA,DB with number 2 and will stop at columns : GY,GZ with number : 53.

For all the numbers as " position 3 ". (3 to 54)

It will start at columns : HA,HB with number 3 and will stop at columns : KY,KZ with number : 54.

For all the numbers as " position 4 ". (4 to 55)

It will start at columns : LA,LB with number 4 and will stop at columns : OY,OZ with number : 55.

For all the numbers as " position 5 ". (5 to 56)

It will start at columns : PA,PB with number 5 and will stop at columns : SY,SZ with number : 56.



IF I DIDN'T MAKE MISTAKE IT SHOULD BE RIGHT !!!



If someone can write this Excel file, I would really appreciate it. Thank you.
sERGE.
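The lexicographic number asked for here can be computed without listing every combination. Below is a Python sketch, assuming the usual convention that 01-02-03-04-05 is number 1 and combinations are ordered like a sorted list; both function names are invented for the example.

```python
from math import comb

def lexicographic_number(combination, pool=56):
    """1-based position of a combination in lexicographic order."""
    k = len(combination)
    rank = 1
    previous = 0
    for position, number in enumerate(sorted(combination)):
        remaining = k - position - 1    # numbers still to place after this one
        # Skip every combination that has a smaller number at this position.
        for smaller in range(previous + 1, number):
            rank += comb(pool - smaller, remaining)
        previous = number
    return rank

def count_with_number_at(position, number, pool=56, pick=5):
    """How many combinations have `number` as their position-th number."""
    return comb(number - 1, position - 1) * comb(pool - number, pick - position)

print(lexicographic_number((1, 2, 3, 4, 5)))       # 1
print(lexicographic_number((52, 53, 54, 55, 56)))  # 3819816
print(count_with_number_at(1, 1))                  # 341055
```

The second helper gives the length of each column in the layout described above, which is worth checking against the row limit of the Excel version in use before building the file.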
 

Frank

Member
I know this is an old thread, but the topic was recently revived by Colin in a different thread and I post my research here.
I used a spreadsheet https://spreadsheets.google.com/cc...yN05lVE1tZnM1Y3c&hl=en&authkey=CICp0PcD#gid=0
on a line-by-line basis to calculate first the original birthday problem, based on 365 possible different values to compare. It calculates line by line the new probabilities as the number of comparisons (between people in the room) is incremented by 1. Eventually it reaches the 50% level and the number of people in the room can be read off at 23.

Then using the same method I adapted this for lotteries.
It calculates the new odds of a repeat as each new draw is drawn, over 12,000 draws. One can read off the point where the probability of not repeating switches from 0.50002941 to 0.49987197; this happens (for 6/49) at 4403/4404 draws. The spreadsheet also draws a chart of this probability decay on a separate page (Chart1). It is customisable for other lottery types too. The type of lottery is set on the appropriate page, and Chart1 will update accordingly. Note the online spreadsheet is a bit sluggish to update as it stands. If you download it some graphics features will be lost, but that's Google Docs for you.
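For other lottery types, a quick estimate of the break-even point is available without a spreadsheet. The sketch below uses the standard birthday-problem approximation, the square root of 2 x N x ln 2, where N is the number of possible combinations. The game sizes are assumptions for illustration: main numbers only for 5/56 and 5/55, and 10,000 outcomes for a Pick 4.

```python
from math import comb, log, sqrt

def approx_draws_for_even_odds(possibilities: int) -> float:
    """Approximate number of draws at which a repeat becomes a 50/50 bet."""
    return sqrt(2 * possibilities * log(2))

games = {
    "6/49": comb(49, 6),
    "5/56": comb(56, 5),   # main numbers only (assumption)
    "5/55": comb(55, 5),   # main numbers only (assumption)
    "Pick 4": 10**4,       # assumes 0000 to 9999
}
for name, combos in games.items():
    draws = approx_draws_for_even_odds(combos)
    print(f"{name}: about {draws:,.0f} draws, "
          f"{draws / combos:.3%} of all combinations")
```

For 6/49 the estimate lands within one draw of the 4403/4404 crossover read off the spreadsheet, and the break-even point grows with the square root of the number of combinations.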

To back up the calculations, and following GillesD's comments in this thread, I also ran trials using the RANDBETWEEN function to generate random lexicographic indexes, but limited the number of draws allowed per trial to figures at and around the expected 4404. Running the trial 1000 times and counting how many lotteries of 4404 draws have at least one repeated draw, one would expect about 500 lotteries with a repeat (50%) if this calculation is correct.

Using Excel's Rnd function, repeated trials of 1000 runs at exactly 4403 draws came out too high at 54% (see the chart on the Charts2 page) having at least one repeat. Setting the number of draws lower for each run of 1000 at 4346, 4302, 4246, 4196, etc. eventually homed in on about 4200 draws at the 50% repeat level, but after dozens of runs of 1000 times the curve is a bit kinky. I suspect hundreds of such runs might be required to smooth it out.

Following Colin's article on the use of the Mersenne Twister algorithm for such trials I decided to try it.
Using exactly the same approach but with the MT algorithm, new data was obtained and if you look at the charts on the Charts2 page the difference is clear. The red curve plots the (average) count of lotteries with a repeat, equivalent to percentages if you divide by 10, for multiple runs of 1000 lotteries, each preset at a fixed number of draws as shown on the horizontal axis. The first graph shows an intersection at 500 (50%) at around 4400 draws.
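A trial of this kind can be sketched in Python, whose standard `random` module uses the Mersenne Twister as its core generator. This is not the spreadsheet or VBA code behind the charts; the function names, the seed and the reduced number of lotteries (400 rather than 1000, to keep the run short) are choices made for the example.

```python
import random
from math import comb

def has_repeat(draws: int, combos: int, rng: random.Random) -> bool:
    """Simulate one lottery of `draws` draws; report whether any draw repeats."""
    seen = set()
    for _ in range(draws):
        index = rng.randrange(combos)   # random lexicographic index
        if index in seen:
            return True
        seen.add(index)
    return False

def repeat_fraction(draws: int, lotteries: int, seed: int = 2010) -> float:
    """Fraction of simulated 6/49 lotteries that contain at least one repeat."""
    rng = random.Random(seed)           # Mersenne Twister, fixed seed
    combos = comb(49, 6)
    hits = sum(has_repeat(draws, combos, rng) for _ in range(lotteries))
    return hits / lotteries

fraction = repeat_fraction(draws=4404, lotteries=400)
print(f"lotteries with a repeat: {fraction:.1%}")
```

With only a few hundred lotteries the result still wanders by a few percentage points from run to run, which fits the observation that many runs are needed before the curve smooths out.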

To get more detail closely clustered about this exact number, I extended my trial to plot points for preset lotteries of 4401, 4402, 4403, 4404 and 4405 draws. It's amazing that one draw difference (averaged over thousands of lotteries) does show a trend upwards as more draws are in the lottery. You will notice that the intersection with the 500.00 line is not at 4403 or 4404 draws; it is between 4401 and 4402 draws.
I believe the mathematics of the calculation more than the Mersenne Twister; superior as it is to the Rnd function, it may not be perfect. For practical purposes I'm happy that for a 6/49 lotto around 4404 draws will give a 50% probability of a repeat draw appearing.

Frank
 
