I posted a short response at the rng, but a more detailed answer follows.

A lotto game is a uniform distribution.

In your case the y axis should be the standard deviation of each draw and

the x axis should be the sum of the numbers in each draw.

The central moments of the distribution are given at the link below:

http://mathworld.wolfram.com/UniformDistribution.html
Your graph space should be as follows:

Draw                 Sum   Std. Dev.
 1  2  3  4  5  6     21   1.870828693
 1  2  3  4  5 49     64   18.83259586
 1  2  3 47 48 49    150   25.21110866
 1 45 46 47 48 49    236   18.83259586
44 45 46 47 48 49    279   1.870828693

Now you can add the points of your lotto game (6/49 in this case) within this space.
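As a sketch (Python, standard library only; the draws shown are just the boundary rows from the table above), each draw maps to a (sum, sample standard deviation) point like this:

```python
import math

def draw_point(draw):
    """Map a lotto draw to its (sum, sample standard deviation) point."""
    n = len(draw)
    total = sum(draw)
    mean = total / n
    # sample standard deviation (n - 1 in the denominator)
    sd = math.sqrt(sum((x - mean) ** 2 for x in draw) / (n - 1))
    return total, sd

# boundary draws of the 6/49 graph space
for draw in ([1, 2, 3, 4, 5, 6], [44, 45, 46, 47, 48, 49]):
    print(draw, draw_point(draw))
```

Running this over all drawn combinations gives the cloud of points inside the boundary shown in the table.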

The same method can be used for analysis of n-Tuplets.

If you follow the above you should arrive at the method

described as k-nearest neighbour, with a short description below.

k-nearest neighbour

k-nearest neighbour is a clustering technique. As mentioned above, points or records near each other in a multi-dimensional space share many properties. Therefore, the behaviour of these near-neighbour records can be used to predict the behaviour of the record under investigation. By calculating the distance to near neighbours, a measure of the strength of the relationship can be found. k refers to the number of nearest neighbours used in the prediction.

Generally, the shortest distance between two records (points) is a straight line (excepting special cases such as the shortest distance over the surface of a sphere). In two-dimensional space (Euclidean distance), if two records are regarded as two of the points of a right-angled triangle, their distance apart is given by the formula:

D = √(x² + y²)

Where x and y are the differences apart on the x and y axes (which have been normalised to produce meaningful results). The smaller D is, the nearer the two records are, and the more reliable a predictor of behaviour they are likely to be. 10-nearest neighbour would calculate the average behaviour of the 10 neighbours and use this average as a predictor for the record under investigation. There are other methods of measuring distance, e.g. Mahalanobis distance, but they are beyond the scope of this work.
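A minimal sketch of the idea (Python; the records and attribute values below are invented purely for illustration): compute Euclidean distances on normalised attributes, then average the known behaviour of the k nearest records:

```python
import math

def euclidean(a, b):
    # D = sqrt(sum of squared differences); attributes assumed normalised
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(records, target, k):
    """Average the known behaviour of the k records nearest to target.

    records: list of (attributes, behaviour) pairs, attributes normalised.
    """
    nearest = sorted(records, key=lambda r: euclidean(r[0], target))[:k]
    return sum(behaviour for _, behaviour in nearest) / k

# toy, invented data: two normalised attributes, one numeric behaviour
records = [
    ((0.1, 0.2), 10.0),
    ((0.2, 0.1), 12.0),
    ((0.9, 0.8), 50.0),
    ((0.8, 0.9), 55.0),
]
print(knn_predict(records, (0.15, 0.15), k=2))  # averages the two nearby records
```

With k = 10 and real data, this is exactly the 10-nearest-neighbour averaging described above.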

One problem of nearest neighbour techniques is complexity: k-nearest neighbour does not scale well to large data sets. Each record in a database of a thousand records must be compared with every other record, creating on the order of a million comparisons. Secondly, k-nearest neighbour is not a Data Mining method that learns - it merely searches for records with similar attribute values and assumes that their behaviour will be similar depending on their distance apart. Since the search technique is based directly on the data, it is one of the purest techniques, since the search cannot be polluted. k-nearest neighbour should also improve on naive prediction (as should any method considered).

In k-nearest neighbour, each attribute of a record represents a dimension in a multi-dimensional space. For example, marketing records containing attributes for age, gender, marital status, number of children, income, employment status, employment type, car ownership, house ownership, house type, credit card ownership, liabilities and assets, all of which can be justified from a marketing perspective, occupy a space with an equal number of dimensions (13 in this case). A multi-dimensional space with 20 dimensions and a million records is extremely sparsely populated, and k-nearest neighbour does not work well there. In addition, and this is a very real problem for k-nearest neighbour, each record (point) in such a space is an almost equal distance from any other point. Since k-nearest neighbour relies on the distance between records to predict their similarity and hence their behaviour, this undermines the whole rationale behind the method. In contrast, a million records in a three-dimensional space, for example using just income, age and gender from the example before, is reasonably crowded and can produce meaningful results. The problem can be overcome by specifying which attributes are important to a given Data Mining search and mining only on those attributes.
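The "almost equal distance" effect can be seen empirically. A sketch (Python, standard library only; the point counts and dimensions chosen here are arbitrary): compare the relative spread of pairwise distances between random points in 3 dimensions versus 20:

```python
import math
import random

def distance_spread(n_points, dims, seed=0):
    """Ratio of standard deviation to mean of all pairwise distances
    between n_points random points in the unit cube of given dimension."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dims)] for _ in range(n_points)]
    dists = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        for i, p in enumerate(pts)
        for q in pts[i + 1:]
    ]
    mean = sum(dists) / len(dists)
    sd = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return sd / mean

print(distance_spread(200, 3))   # relatively large spread
print(distance_spread(200, 20))  # distances cluster tightly around the mean
```

The 20-dimensional spread is much smaller, i.e. "near" and "far" neighbours become hard to tell apart, which is exactly the problem described above.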

Generally, as discussed earlier, Data Mining algorithms should have a complexity of about O(n log n); k-nearest neighbour therefore does not scale up well (since the number of comparisons grows quadratically with the number of records) and is used on already identified sub-sets of data.
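The scaling argument can be made concrete with a sketch (Python; counting unordered pairs of records, and using n log2 n as the comparison target):

```python
import math

def pairwise_comparisons(n):
    # every record compared with every other record: n * (n - 1) / 2 pairs
    return n * (n - 1) // 2

for n in (1_000, 1_000_000):
    print(n, pairwise_comparisons(n), round(n * math.log2(n)))
```

At a million records the pairwise count is around 5 * 10^11, versus roughly 2 * 10^7 for an n log n method, which is why k-nearest neighbour is restricted to smaller, pre-selected sub-sets.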

More info:

W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, "Numerical Recipes", 2nd edn., Cambridge University Press, Cambridge (1992).

http://www.informatics.ed.ac.uk/teaching/modules/lfd1/books.html
http://courses.bus.ualberta.ca/mgtsc461-erkut/2000/html/software.htm
(xls examples with code)