wikilims road4
TRANSCRIPT
1st Next Gen Sequencer
• Centerpiece of a lab
• Generates new workflows
– These cannot be known in advance
• When they order 2 more sequencers
– Still want a single repository for all runs
Tasks/Workflows
• Production
– Few tasks, all repeated many times
– Rigorous standards
– Ideal for software
• Research
– Many one-off tasks
– Ad hoc standards
– Difficult for software
Sequencer’s Input
• Infinite variety of samples, handling, and lab prep
• All details might matter
– Usually only a few do
The 454 Solution
• A single strict [A-Z0-9]+ field
• Intended as an external primary key
• Makes sample tracking an upstream problem
• Part of the results directory name:
R_TIMESTAMP_MACHINEID_USER_YOURFIELD
• Clean technical solution
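As a sketch of how that directory naming works, the name can be split back into its fields. This is Python, not part of the 454 software, and the example directory name is hypothetical:

```python
# Sketch: split a 454 run directory name back into its fields.
# The layout R_TIMESTAMP_MACHINEID_USER_YOURFIELD comes from the slide
# above; the example value below is made up for illustration.
def parse_run_dir(name):
    prefix, timestamp, machine_id, user, your_field = name.split("_", 4)
    return {"timestamp": timestamp, "machine": machine_id,
            "user": user, "field": your_field}

info = parse_run_dir("R_2008-04-01-120000_GS20ONE_JDOE_PROJ1")
print(info["field"])  # the lab's [A-Z0-9]+ sample key
```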
These are Researchers
• Apparently they wanted a LIMS
• Found a way to cram it in
• PROJIDxxSPECIESxxSAMPLExxDESCxxNOTES
• More or less consistent
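The researchers' "crammed-in LIMS" field can be unpacked the same way. A minimal Python sketch (the field layout is from the slide; the example value is hypothetical):

```python
# Sketch: unpack the researchers' "xx"-delimited 454 field.
# Layout PROJIDxxSPECIESxxSAMPLExxDESCxxNOTES is from the slide;
# the example value is made up for illustration.
def parse_454_field(field):
    keys = ["projid", "species", "sample", "desc", "notes"]
    return dict(zip(keys, field.split("xx")))

rec = parse_454_field("P01xxECOLIxxS12xxLIBPREP2xxRERUN")
print(rec["species"])  # ECOLI
```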
Additional Details
• 3 machines
• Signs of strain by the 50th run
– Difficult to look across machines
– Too many DESCRIPTION variants
– Desire to rename old data
Key Terms
• Wiki – "quick" in Hawaiian
• LIMS– Laboratory Information Management System
• Mediawiki– Software that runs Wikipedia
Flexible database (2/3)
• No need to abuse a 'comments' field
• Everything is a comment until you make it structured
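One way to read "everything is a comment until you make it structured": free text stays free text until someone wraps it in a wiki template, at which point a program can pull fields out of it. A minimal sketch in Python; the `{{Run|...}}` template and its parameters are hypothetical, not from the actual wikilims setup:

```python
import re

# Sketch: once a freeform note is wrapped in a wiki template, a program
# can extract its fields. The {{Run|...}} template here is hypothetical.
def template_params(wikitext, name):
    m = re.search(r"\{\{" + re.escape(name) + r"\|([^}]*)\}\}", wikitext)
    if not m:
        return {}
    params = {}
    for part in m.group(1).split("|"):
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip()
    return params

page = "Notes from today. {{Run|machine=GS20|sample=S12}} More notes."
print(template_params(page, "Run"))  # {'machine': 'GS20', 'sample': 'S12'}
```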
Automatic data capture - Raw
• Most structured content can be captured and recorded by programs as it is generated
Custom HTML
• Tricks you've never seen wikipedia do
• Adding a record via a form
• Run custom perl/php code
• Generate any html on the fly (AJAX)
User Interface
• Traditional LIMS UI
– Must be done up-front
– Can be hardest part to get right
• Wiki provides a minimal UI
– Instantaneous and consistent
– Focus on data first
– Improve it when and where needed
As Details Emerge
• Users can edit data with only a browser
– Won't make 5000 changes by hand
– But 50 is faster and cheaper than calling in a coder
• Write software only for the heavy lifting
– Cost-effective only if we will do something many times
– Deferred until patterns emerge and become tedious
Reading Wiki From Perl
use Perlwikipedia;
$bot = Perlwikipedia->new;
$bot->set_wiki($hostname, $directory);
$bot->login($username, $password);
$pagetext = $bot->get_text("Main Page");
Edit Wiki Pages
@pages = $bot->get_all_pages_in_category(
"Category:Is_a_454_Run");
foreach $page (@pages) {
$oldtext = $bot->get_text($page);
$newtext = "$oldtext changed by bot";
$bot->edit($page, $newtext, $comment);
}
SPARQL
PREFIX abc:
<http://mynamespace.com/exampleOntologie#>
SELECT ?capital ?country
WHERE {
?x abc:cityname ?capital.
?y abc:countryname ?country.
?x abc:isCapitalOf ?y.
?y abc:isInContinent abc:africa.
}
Select all African capital cities from wikipedia
DBpedia.org
• Use SPARQL to query directly against wikipedia
• Make a local relational cache
• Query with SQL
• You hide your SQL behind a layer anyway… right?
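The "local relational cache" idea can be sketched with SQLite in Python. The table layout and rows below are hypothetical stand-ins, not DBpedia's actual schema or data:

```python
import sqlite3

# Sketch: cache facts pulled from a SPARQL source in a local relational
# table, then query it with plain SQL. Table layout and rows are
# hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE capitals (country TEXT, capital TEXT, continent TEXT)")
conn.executemany(
    "INSERT INTO capitals VALUES (?, ?, ?)",
    [("Kenya", "Nairobi", "africa"),
     ("Ghana", "Accra", "africa"),
     ("France", "Paris", "europe")])

# The SQL mirrors the SPARQL query above: capital/country pairs, Africa only.
rows = conn.execute(
    "SELECT capital, country FROM capitals WHERE continent = 'africa'"
).fetchall()
print(rows)
```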