============= Gene Ontology ============= Gene ontology data is represented by two data types; you can find details of these in :ref:`ontologyClassSection`. At present, the only input format supported is that produced by DAVID. Here's an example of loading several DAVID output files and producing a new data frame containing the GO terms, catagories, and significance (p-value) from each enrichment analysis for those terms where at least one of the analyses was significant: >>> import os, sys >>> from pyokit.io.david import david_results_iterator >>> PVAL_THRESHOLD = 0.01 >>> filenames = sys.argv[1:] >>> # load all of the DAVID results for each file >>> by_trm = {} >>> for fn in filenames: >>> for r in david_results_iterator(fn): >>> if not r.name in by_trm: >>> by_trm[r.name] = {} >>> by_trm[r.name][fn] = r >>> # drop terms where no file has p < threshold >>> by_trm = {term:by_trm[term] for term in by_trm >>> if min([by_term[term][fn].pvalue for fn in by_term[term]]) < PVAL_THRESHOLD} >>> # output >>> for term in by_trm: >>> for fn in by_trm[term]: >>> r = by_trm[term][fn] >>> print r.name + "\t" + str(r.pvalue) + "\t" + r.catagory + "\t" + fn This makes use of an iterator for DAVID otuput-format files. Here are the details of that function: .. autofunction:: pyokit.io.david.david_results_iterator