
Well, the right answer is "it's complicated". DNA-wise, yes, there are 3 billion bases and they sample about a million of them, so roughly 1 in 3000 bases.

The thing about the bases is:

1. most of them are the same in everyone

2. the ones that are different tend to be correlated with their neighbors locally (linkage disequilibrium), and are thus largely captured by the roughly 1-million-site assay

so

3. you can infer (impute) all the stuff they didn't sequence with pretty high confidence. This is not an "in theory" thing; it has been a typical thing to do in the statistical genetics field for at least a decade

4. outside of the genome there is the epigenome, which may or may not be relevant; it undergoes very specific resets shortly after fertilization
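
To make points 2 and 3 concrete, here is a toy sketch of inferring unmeasured sites from measured ones. Everything in it is made up for illustration: the reference panel, the site positions, and the names `REFERENCE_PANEL`, `TYPED_SITES`, and `impute` are all hypothetical, and real imputation tools use far more sophisticated statistical models than a majority vote. It only shows why local correlation lets a few typed sites predict the rest.

```python
from collections import Counter

# Hypothetical reference panel: each string is one haplotype over 6 sites.
# "0" = reference allele, "1" = alternate allele. Made-up data.
REFERENCE_PANEL = [
    "010010",
    "010010",
    "010011",
    "101100",
    "101100",
    "101101",
]

# Sites the assay actually measures (an assumption for this toy example).
TYPED_SITES = [0, 3]


def impute(observed, panel=REFERENCE_PANEL, typed=TYPED_SITES):
    """Fill in every site by majority vote among matching panel haplotypes.

    observed: dict mapping typed site index -> allele ("0" or "1").
    Returns a list of (allele, confidence) tuples, one per site, where
    confidence is the fraction of matching haplotypes carrying that allele.
    """
    # Keep only panel haplotypes that agree with the sample at the typed sites.
    matches = [h for h in panel if all(h[s] == observed[s] for s in typed)]
    if not matches:
        raise ValueError("no panel haplotype matches the observed sites")

    result = []
    for site in range(len(panel[0])):
        counts = Counter(h[site] for h in matches)
        allele, n = counts.most_common(1)[0]
        result.append((allele, n / len(matches)))
    return result


calls = impute({0: "0", 3: "0"})
inferred = "".join(allele for allele, _ in calls)
```

In this toy panel, two typed sites pin down four of the six sites with full agreement, and the last site gets a lower confidence because the matching haplotypes disagree there.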

also worth noting

5. bioinformatics is an imperfect, algorithm-based science. Reads are aligned according to error and difference profiles (i.e. string mismatches, and which types of mismatch are most and least common). So unless the alignment penalties are finely calibrated for the particular data set, a level of analysis most bioinformaticians don't do, even a robust, bug-free read aligner correctly applying its penalties will produce bad alignments, false positives, and false negatives

anyways, I for one hope the future people will clone me from the bytestream
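
A small sketch of point 5, assuming a simplified ungapped scoring model that is not taken from any real aligner: the same read is placed at a different locus depending only on how transitions and transversions are penalized. The sequences, the locus names, and the functions `penalty` and `best_locus` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True if a and b differ but are both purines or both pyrimidines."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def penalty(read, ref, transition_cost, transversion_cost):
    """Sum mismatch penalties for an ungapped, equal-length comparison."""
    total = 0
    for r, g in zip(read, ref):
        if r == g:
            continue
        total += transition_cost if is_transition(r, g) else transversion_cost
    return total


def best_locus(read, loci, transition_cost, transversion_cost):
    """Return the name of the candidate locus with the lowest total penalty."""
    return min(
        loci,
        key=lambda name: penalty(read, loci[name], transition_cost, transversion_cost),
    )


# Made-up read and candidate reference loci.
READ = "ACGTACGT"
LOCI = {
    "locus_A": "ACGTACGA",  # one transversion relative to the read
    "locus_B": "GCGTACGC",  # two transitions relative to the read
}

uniform = best_locus(READ, LOCI, transition_cost=1, transversion_cost=1)
weighted = best_locus(READ, LOCI, transition_cost=1, transversion_cost=4)
```

With uniform costs the read goes to `locus_A` (one mismatch beats two); with transversions costed at 4 it goes to `locus_B`. Neither run has a bug, and which answer is right depends on whether the penalties fit the data.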



