The probability distribution of number appearances of a ball in a sorted position

Frank

Member
As discussed in another thead the monitoring of skip values for lottery balls can be done in one of three ways. One is the simple noting of the draw number it is drawn in and calculation of skips directly from that. Another way would be to note down the result in DRAWN order and then calculate skip values for each ball in its DRAWN position, creating N different
skip values for each ball if N balls are drawn . With this method all balls are equal in that the probability of a ball being drawn first , second, or last is identical for each ball. Therefore the average theoretical skip is identical for each ball.

Then there is a third method. Monitoring the skip values of each ball in each position when the result has been sorted in ascending order. It is important to be aware that not all balls are equal in each sorted position because the act of sorting creates a probability distribution as to where in order a ball can possibly be after sorting, depending on the values of the other balls. With ball number 1 it is easy to see that it can only be in first position and can only have a skip value in first position.
The below image shows what I mean for a 6/49 lottery. If you were to examine every possible lottery ticket each number can only appear when sorted - that number of times in that position. This has a direct effect on how long you would need to wait to see a number drawn in that sorted position.

http://i41.tinypic.com/111uu61.gif

From such a table, probabilities and theoretical skips follow. So it is important to know the theoretical skip values if you monitor the actual skips on a positional basis on sorted results as they are all different.

I intend to show here the steps required to create a table suitable for most lotteries, you just enter the size:- total number of balls and the number of balls drawn, and the sheet does the rest.

First a note about Excel compatibility. As the active size of the tables are unknown until you enter the two parameters defining your lottery, the sheet was designed to be big enough to cover 59 balls and up to 7 balls drawn. In most cases some rows and columns will be redundant and would display an "#NUM!" message in the cells of redundant rows/columns. To stop this I have used the ISERROR function (Excel 98-2003) to screen for them and blank them out. For later versions of Excel there is a more compact way of dealing with it using the newer IFERROR function. As I said, for compatibility back to Excel 2000 I will be using the lengthier ISERROR method.

OK here goes. The above image should aid understanding.

STAGE 1
Allocate 2 cells G4 for the number of balls drawn (up to 7)... and H4 for total number of balls in the pool (up to 59). Put a bold border around the cells, perhaps highlight their background colour so they are clear. Just for now , so we are on the same page Enter 6 in G4 and 49 in H4. You can change this later when we are all done. As the sheet evolves we will see figures for a 6/49 lottery.

Now we require to construct a small reference table (I'll explain what it does later)
In cell C6 enter the formula =\$G4-1.
In cell D6 enter the formula =C6-1

Copy the formula in D6 across to column I to cell I6. You will probably see ,5,4,3,2,1,-1 in these cells now. Bear with me..

In Cell C7 enter the value 0 (zero). In Cell D7 enter value 1 , in E7 enter 2, in F7 enter 3 and so on .. (numbers 0 to 6 along row 7. I7 will contain value 6 ) Put borders around the cells (like in the image above).

In Cell J4 enter the formula =COMBIN(H4,G4) this is the number of combinations for the lottery and will be used in other formulas. Label it or highlight it. Its an important cell.

The main Ball count table is directly below using columns B to J.

(you may wish to skip this explanation until the table is completed and go straight to consructing the main table)
Now the mini table above (C6 to I7) works out how many columns there are to the left or right of each ball position. For example in the fifth sorted position (column G) if you look above, it shows a 4 and 1, meaning there are 4 columns to the left and 1 (valid) column to the right of that position for that number of balls drawn. Main formulas need this information later.

The formulas in the main table work out how many balls are lower than the ball number we are considering, find out how many places are available to the left of 'where we are now' for those numbers to sit in and works out a combinational figure for how many 'n tuplet' combinations can be less whilst this number sits here. They also work out how many numbers are higher than the number we are considering, find how many places are available for them to sit in and work out another combinational figure of all the possible n_tuplets there can be from this position onwards starting with this ball in this position.
Finally they multiply the two figures together to work out how many times the number we are considering can be here in this position, surrounded by all the available other numbers.
Its bit hard to get your head around, but the formulas work, thats the main thing.

Now for the main ball count/positional table ...

First we need a list of ball numbers in column B
In cells B10 down to B68 enter ball numbers 1,2,3......59. Perhaps highlight them in red.

Right now for the Main table formulas. Copy this to your clipboard,......

=IF(ISERROR(COMBIN(\$B10-1,C\$7)*COMBIN(\$H\$4-\$B10,C\$6)),"",COMBIN(\$B10-1,C\$7)*COMBIN(\$H\$4-\$B10,C\$6))

Click on cell C10 then Click your mouse in the formula bar and paste that formula in, press <enter>
Now copy the formula across row 10 to cell I10 ( you will not see anything appear though).

Select cells C10:I10 and copy the formula down to row 68 thus filling C10:I68 with formulas. Note that the last position column (column I) will look empty because for 6/49 there is no 7th position.

In cell J10 enter (or use copy/paste as above) :- =SUM(C10:I10)
Copy that down column J to cell J68. All the values should be the same in this column, it checks all is well with the values in the other columns if the sums here are equal. Enter a label in Cell A72 "Totals"

In Cell C72 enter the formula =SUM(C10:C71). Copy this formula along row 72 to Cell I72. So now C72:I72 will display
totals for each column. Again they should all be equal if all is well.
Note the diagonal stepping at the top and bottom of the ball numbers to reflect the impossibility of certain balls appearing in certain columns when sorted ascending.

Finally some sorted position headings :- in Cell B9 enter "N" , and in cells Cells C9 through to I9 enter labels A,B,C,D,E,F,G as per the image. FORMAT the headings appropriately.

In cell J9 enter "TOTAL"

This table is now complete. You should now see the number count distribution for all the numbers in all column positions.

All that remains is to format it in approprate colours and do cell borders etc. Add The table title THEORETICAL NUMBER OF OCURRENCES OF SORTED NUMBER N IN COLUMN SHOWN .

If you did it right, it will look similar to my image. For readability I will continue with the next table in the next post....

Frank

Member
STAGE 2

For help in what you should see, view this image. http://i39.tinypic.com/2zi1rx4.gif.

The next table will deal with probabilities, so we need a dynamic title.
Copy this formula to the clipboard:-
=CONCATENATE("PROBABILITY OF OCURRENCES OF NUMBER N IN COLUMN SHOWN ",G4,"/",H4," LOTTERY")

Select cell C75, click in the formula bar and paste the formula in. Press <enter>.
-This will display a title to match the parameters you entered.

Now for the table formulas.
In cell c79 enter the formula (or copy and paste it before) :- =IF(ISERROR(C10/\$J\$4),"",C10/\$J\$4) and press <enter> . It should calculate a value of 0.122. You may need to format the cell as number and set it to display 3 decimal places to see that.
copy cell c79 across row 79 to cell I79 to fill rangeC79:I79 with the formula.

Now select range c79:I79 and copy the formula down to row 137 so that range c79:I137 now contain the formula. Note some cells will appear empty where values are not applicable.

In cell j79 enter the formula :- =SUM(C79:I79) and press <enter>. Copy the formula in cell j79 down the column I to cell J137. All the totals values you see should be identical (0.122) except for redundant rows for larger lotteries than specified ,this checks all is well with the table formulas.

In Cell A140 type in a heading "TOTALS", then in cell C140 enter the formula =SUM(C79:C137) and press <enter>.
Copy the formula in cell C140 across row 140 to cell I140 to fill range(C140:I140) with that formula. All the totals should equal 1.000 except those in redundant column(s) such as I for a 6/49 lottery.

Select cell B78 and Paste the headings here.
Copy the nUmber headings range B10:B68 and paste them into cell B79.

This completes the probability table except you may want to format the colour of alternate rows for example, add borders and colour ball headings etc.

For the third table (of skips) see the next post.

Frank

Member
STAGE 3

To aid you , so you know how (the top half ) should look see this image. http://i41.tinypic.com/vnzedh.gif

Now create the theoretical average skip table.

First the dynamic Title for the table:-
Copy this formula to the clipboard :-
=CONCATENATE("Theoretical Number of draws between apearances in these sorted positions (skips) ",G4," /",H4,"lottery")

select cell P7 and click in the formula bar and paste that formula in. press <enter>
Now the main table formulas:-
Copy this formula to the clipboard :-
=IF(ISERROR(\$J\$4/C10),"",\$J\$4/C10)
select cell P10 click in the formula bar and paste that formula in. Press <enter>

Copy that fromula across row 10 to cell V10 so that range(P10:V10) have formulas in. Select range(P10:V10) and copy the
formulas down to row 68 so that range(P10:V68) have formulas in them. They should calculate a variety of integer values between 8 and 13,983,816 for a 6/49 lottery, but some redundant rows and column(s) may appear empty. Whilst the whole formula range is highlighted, format the cells as number with zero decimal places.

Now the rows and columns need headers. For the column headers, copy range(B9:I9) and then paste into cell O9.
For the ball number header Copy range(B10:B68) and then paste into cell O10.

This table is now complete but may require some formatting of colours and borders adding.

Try changing the lottery parameters in cells G4 and H4 and see the sheet recalculate the new skips etc.

I think that's enough for you, see how you get on. Please report back regardless of whether you have problems or not. I need to know if it works for you and if you have any questions.

PAB

Member
Hi Frank,

OUTSTANDING .
Your instructions were straight forward and easy to follow, I managed to setup the three tables with no problems at all.
Just one point, albeit a title position in Stage 2 ...

In Cell A140 type in a heading "TOTALS", then in cell C140 enter the formula =SUM(C79:C137) and press <enter>.
Like myself, I am sure other members and guests will appreciate the time and effort you have put into creating this.

Regards,
PAB

-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-
12:45, restate my assumptions.
Mathematics is the language of nature.
Everything around us can be represented and understood through numbers.
If you graph the numbers of any system, patterns emerge. Therefore, there are patterns everywhere in nature.

Frank

Member
Hi PAB, thanks for your feedback and for spotting the typo. Despite triple checking it and starting from scratch following my own instructions, that one slipped through the net. I'm just grateful that at my age ( I'm a pensioner in 2 months) I still have enough brain cells to carry on doing this.

There is in fact more that can be done with this project. An identically shaped fourth table can be created holding either the draw number that each ball was last drawn in each position or how many draws since last drawn. Mine uses a top downward sorted results table with the draw numbers on the right and staggered VLOOKUP functions to do this.
From this a fifth table (also identically shaped) can be made to hold a table of overdue values by directly dividing the "draws since last drawn" by the threoretical skip value from table 3.

Regards,
Frank

Frank

Member
I forgot to say, that such a devolopment would be specific to the type of lottery whose results are being tracked, so it would need to be on dedicated copy with locked parameters to that lottery.

PAB

Member
Hi Frank,

My Dad always said, "You're only as old as you feel", that can lead to some interesting times, believe me.

Anyway, when you have recovered from the previous three posts I think ...

An identically shaped fourth table can be created holding either the draw number that each ball was last drawn in each position or how many draws since last drawn ......... a fifth table (also identically shaped) can be made to hold a table of overdue values by directly dividing the "draws since last drawn" by the threoretical skip value from table 3
... would complete this project nicely and wrap it ALL up in one package.
Keep up the excellent work Frank.

Regards,
PAB

-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-
12:45, restate my assumptions.
Mathematics is the language of nature.
Everything around us can be represented and understood through numbers.
If you graph the numbers of any system, patterns emerge. Therefore, there are patterns everywhere in nature.

Frank

Member
Hi PAB,

Well I have a lot on my plate at the moment, I'll consider doing that if I have some spare time, meanwhile I'm wondering who else managed to follow my instructions ?

PAB

Member
Hi Frank,

I must admit that it has been quiet here for a while, but I am sure as people come back and compile the SpreadSheet there will be interest and feedback.

Regards,
PAB

-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-
12:45, restate my assumptions.
Mathematics is the language of nature.
Everything around us can be represented and understood through numbers.
If you graph the numbers of any system, patterns emerge. Therefore, there are patterns everywhere in nature.

Frank

Member
Well PAB, after a month I think its clear that only you and I have an interest in this kind of thing and I reckon I've scared them all off. It has discouraged me from taking this any further, unless others come aboard but I'm sure you have enough experience to do so without my help.

PAB

Member
Hi Frank,

An identically shaped fourth table can be created holding either the draw number that each ball was last drawn in each position or how many draws since last drawn. Mine uses a top downward sorted results table with the draw numbers on the right and staggered VLOOKUP functions to do this.
From this a fifth table (also identically shaped) can be made to hold a table of overdue values by directly dividing the "draws since last drawn" by the threoretical skip value from table 3.
Well PAB, after a month I think its clear that only you and I have an interest in this kind of thing and I reckon I've scared them all off. It has discouraged me from taking this any further, unless others come aboard but I'm sure you have enough experience to do so without my help.
I think you might be right Frank .
If I have some time over the next few days I might revisit this and take it to the next level. I will let you know how I get on.

Regards,
PAB

-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-∏-
12:45, restate my assumptions.
Mathematics is the language of nature.
Everything around us can be represented and understood through numbers.
If you graph the numbers of any system, patterns emerge. Therefore, there are patterns everywhere in nature.

jack

Member
Hello Frank, still in your mind, how would you find the probability of each
Suit or trio positional
Example a lottery 49/6 we have 20 positions to suit
1,2,3 1,2,4 1,2,5 1,2,6 .... = 4,5,6 are up to 20 positions for threesomes or suits
In 49/6 have 18.424 possible suit,
The question is given such a suit
12,23,45, and know the probability of his best position?
The difference is that you did with separate numbers, the question
When the suit, which is three numbers together, as one does the calculation of probability? Can you help me please. thank you

Frank

Member
Hello Jack,

For events which have to occur simultaneously and are dependent on each other as in the situation you suggest, then the individual probabilities need to be multiplied to arrive at the overall probability of the event.
For a triple event where a named ball has to be in one position AND another named ball has to be in another position AND a third named ball has to be in another position P= p1 xP2 X p3.

So If ball number 3 has to be in position 1 AND ball number 10 has to be in postion 3 AND ball number 40 has to be in postion 5 we first look up their individual probabilities in those positions. See the table below fro a 6/49 lottery.

http://i46.tinypic.com/2bnzlw.jpg

From the table:-
the probability of number 3 being in position 1 is 0.098.
The probability of number 10 being in position 3 is 0.024.
The probability of number 40 being in position 5 is 0.053.

The probability of that triple in those exact positions is 0.098 x0.024 x 0.053 =0.00012.
To convert that to odds it is (1-0.00012)/0.00012 =8021 to 1.

Now you did say
The question is given such a suit
12,23,45, and know the probability of his best position?
That would involve calculations of all the probabilities of the 3 numbers 12,23,45 in all the possible combinations of columns and looking for the lowest figure after multiplying the 3 factors for each. In reality it comes down to common sense, since we instinctively know that low numbers have a higher probability of appearing in the first columns, high numbers have a higher probability of appearing in the last columns, whilst numbers like 24,25,26 (the mid point numbers) have a higher probability of appearing in the middle columns.. Three higher factors multiplied together are more likely to have a higher probability such as 1,25,49 positioned in 1st, 3rd and 6th columns p=0.00509 or 1678 to 1.
For your specific example, by inspection of the table just lookup the best factor for number 12 note its column, then find the best factor for 23 from the remaining available columns, and finally the best factor for number 45 from the remaining available columns. Should give you a good idea where they should be.

Frank

Frank

Member
Correction
Frank said:
That would involve calculations of all the probabilities of the 3 numbers 12,23,45 in all the possible combinations of columns and looking for the lowest figure

Frank
Of course you are looking for the highest value of probability, not the lowest, for best position.

jack

Member
Hello Frank!!!perfect, good job, but I have a doubt.
The doubt is when you have two possibilities, in order = raffles, and in ascending order
When assembling the bet,
Example 49,12,24 this suit, this suit can be placed in the 1st, 2nd, 3rd placing aleatoriomente
Now another example if I choose to mount the game in ascending order of probability,
Example 02,12,25, where the limit goes up to the third position of number 01 to 34 for example, al in this case the number 49 just might be the last position! How do you calculate this? Thanks frank

jack

Member
Hello, Frank, can you please make a lottery table to 60/6 (mega sena)
Is equal to 49/6
Because I want to see a larger range of probalidades of each position, a limit of maximum and minimum frequency. In each column as a result it seems that 80% are in the draw! thank you

Frank

Member
jack said:
Hello Frank!!!perfect, good job, but I have a doubt.
The doubt is when you have two possibilities, in order = raffles, and in ascending order
When assembling the bet,
Example 49,12,24 this suit, this suit can be placed in the 1st, 2nd, 3rd placing aleatoriomente
Now another example if I choose to mount the game in ascending order of probability,
Example 02,12,25, where the limit goes up to the third position of number 01 to 34 for example, al in this case the number 49 just might be the last position! How do you calculate this? Thanks frank

Sorry Jack, but I have no idea what you just said.

aleatoriomente ? ?

jack said:
Now another example if I choose to mount the game in ascending order of probability,
Example 02,12,25, where the limit goes up to the third position of number 01 to 34 for example, al in this case the number 49 just might be the last position! How do you calculate this? Thanks frank

I don't understand this either, doesn't make sense to me!

Frank

Member
jack said:
Hello, Frank, can you please make a lottery table to 60/6 (mega sena)
Is equal to 49/6
Because I want to see a larger range of probalidades of each position, a limit of maximum and minimum frequency. In each column as a result it seems that 80% are in the draw! thank you
Do you mean this ?

http://i48.tinypic.com/15ot4k5.jpg

Regards.
Frank

jack

Member
Hello, ok perfect good job, thanks for table 60/6, but because it's the same concept as the lottery 49/6,
Frank, is that when the draw is made each number out randomly in the result,
Then it is placed in ascending order, agree to frank in ascending order of the draw
Seems more promising, careful with the translation of google, ok
Frank, as you might find a suit, among the 20 possible tender, which will be fixed
Both wheels to create, or as the basis for another 19 suits?
What is the probability of finding a suit with greater chances of quitting in the next draw? thank you frank

Frank

Member
Hi Jack,

I think you misunderstand the purpose of this type of analysis. It is for single balls and can give a guide for single balls as to when a ball might be overdue to appear in a given sorted position. The overue statistic for single balls is possible to use because the time periods of expectation are within reasonable periods of time. For example for the UK lottery the number 37 is the most (positionally) overdue ball in the last position (column 6). The average expectation between appearances is (coincidentally) about 37 draws for this ball in this position. As it hasn't appeared for 223 draws it is over 6 times overdue and this suggests that in the not too distant future it is likely to appear in the last place again after sorting. So you can speculate within a few months or weeks (depending on the frequency of the draw) as to when it might appear here again.

As I demonstrated in my earlier calculations, if you try and combine these statistics to try and find out when a triple is positionally overdue, then the expected period between appearances of any triple in a specified position after sorting can run into many thousands of draws. If there is only one draw per week then you might have to wait forty years to find out when the first triple is overdue, and from this predict that it might appear again here in another 40 years. This is totally impractical and in my opinion, completely pointless. I don't know how long you are hoping to live for Jack, but really this line of analysis will unlikely to be of any practical help and is not worth persuing.