As with most bioinformatic approaches, there are several methods one could employ to examine microbial sequences for the hallmarks of milk utilization. These general protocols represent facile, yet informative, approaches for use by individuals with varying degrees of bioinformatic training.
To determine the distribution of a particular gene of interest (e.g., beta-galactosidase) among sequences deposited in public databases, one could simply run a BLAST pairwise alignment against the regularly updated GenBank database: BLAST at NCBI
This can be accomplished with either protein or nucleotide sequences of milk-relevant genes.
An example BLASTP query of a bifidobacterial beta-galactosidase against the GenBank non-redundant (NR) database can be found here: Example B-gal BLAST
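Once a search like the one above has been run, the hit table can be summarized to see how the gene is distributed across genera. The following is a minimal sketch, not a definitive pipeline. It assumes the results were saved as tab-separated text with the 12 standard BLAST tabular columns plus one extra column holding the subject's scientific name. The function names, thresholds, and example rows are all hypothetical and chosen only for illustration.

```python
import csv
import io
from collections import Counter

# The 12 standard columns of BLAST tabular output, plus one ASSUMED
# extra column ("sciname") carrying the subject organism's name.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore",
           "sciname"]


def read_hits(handle):
    """Parse tab-separated BLAST hits into a list of dictionaries."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    hits = []
    for row in reader:
        row["pident"] = float(row["pident"])   # percent identity
        row["evalue"] = float(row["evalue"])   # expectation value
        hits.append(row)
    return hits


def genus_distribution(hits, max_evalue=1e-10, min_identity=30.0):
    """Count hits per genus that pass illustrative cutoffs.

    The cutoffs are arbitrary starting points, not recommendations.
    """
    counts = Counter()
    for hit in hits:
        if hit["evalue"] <= max_evalue and hit["pident"] >= min_identity:
            genus = hit["sciname"].split()[0]  # first word of the binomial
            counts[genus] += 1
    return counts


# Made-up rows for demonstration only; these are not real search results.
EXAMPLE = (
    "query1\tsubjA\t98.5\t700\t10\t0\t1\t700\t1\t700\t0.0\t1400\tBifidobacterium longum\n"
    "query1\tsubjB\t85.2\t698\t100\t2\t1\t698\t3\t700\t1e-180\t1100\tBifidobacterium breve\n"
    "query1\tsubjC\t41.0\t650\t380\t5\t20\t670\t15\t660\t3e-60\t420\tLactobacillus delbrueckii\n"
    "query1\tsubjD\t27.3\t300\t210\t8\t50\t350\t40\t340\t2e-12\t90\tEscherichia coli\n"
    "query1\tsubjE\t35.0\t80\t50\t2\t10\t90\t5\t85\t0.5\t30\tBacillus subtilis\n"
)

if __name__ == "__main__":
    hits = read_hits(io.StringIO(EXAMPLE))
    for genus, n in genus_distribution(hits).most_common():
        print(genus, n)
```

With real data, `io.StringIO(EXAMPLE)` would be replaced by an open file handle on the saved hit table.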
An additional method for scanning microbial sequences for milk-relevant genes uses the Joint Genome Institute’s Integrated Microbial Genomes (IMG) online platform. This tool is extremely user-friendly and is updated quarterly. For more information on IMG, please visit their site and/or refer to PMID: 16381883
An introductory tutorial for using IMG to scan for potential milk utilization genes in microbes has been created and can be accessed here.
These analyses are routinely used to detect homologs, that is, genes descended from a common ancestral DNA sequence. However, great care must be taken not to over-interpret the results in terms of functional or evolutionary significance. For instance, fucosidases found in soil bacteria are not likely to signal an evolutionary link with milk. The presence of a homolog is only predictive of a biochemical function on milk molecules (e.g., fucosylated milk oligosaccharides).
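One way to keep this caution visible when triaging hits is to label them conservatively rather than assign a function. The sketch below is an illustration under stated assumptions: the function name, the labels, and both cutoffs are hypothetical, and coverage is only approximated from the alignment length, which in BLAST output includes gap positions.

```python
def classify_hit(evalue, aligned_length, query_length,
                 max_evalue=1e-10, min_coverage=0.7):
    """Label a hit as a candidate homolog or a weak/partial match.

    A "candidate homolog" label is a prediction only; it does not
    demonstrate activity on milk molecules. Cutoffs are illustrative.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    # Approximate fraction of the query covered by the alignment.
    coverage = aligned_length / query_length
    if evalue <= max_evalue and coverage >= min_coverage:
        return "candidate homolog"
    return "weak or partial match"


if __name__ == "__main__":
    # Hypothetical values for a 700-residue query.
    print(classify_hit(1e-50, 650, 700))
    print(classify_hit(1e-50, 200, 700))
```

Hits labeled as candidates would still need biochemical or ecological evidence before any link to milk utilization is inferred.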