GFP Workshop

Undergraduate Bioinformatics Club (UBIC) at UCSD

Alexander Niema Moshiri

Green Fluorescent Protein:

Origins

Green Fluorescent Protein (GFP) is a naturally occurring protein in a species of jellyfish, Aequorea victoria

When excited by blue or ultraviolet light, GFP fluoresces green

A fluorescent Aequorea victoria


A Brief History of wtGFP

GFP has been studied since the 1960s

However, its utility for molecular biologists was not realized until the 1990s

In 1992, Douglas Prasher cloned and sequenced the wild-type GFP (wtGFP) gene

“Wild-type” = Natural

Prasher proposed using GFP as a biochemical tracer that allows us to look at the inner workings of cells

Douglas Prasher


Recombination of wtGFP

The lab of Martin Chalfie expressed wtGFP in E. coli and C. elegans

To their surprise, wtGFP was able to glow in both species without needing any jellyfish cofactors

C. elegans expressing wtGFP


Bioengineered

In 1995, by changing a single amino acid, Roger Tsien engineered the first improved mutant of GFP with increased fluorescence and photostability

Tsien was awarded the 2008 Nobel Prize in Chemistry for his GFP work

He is currently a professor at UCSD

Further improvements to GFP were made over the next few years

Roger Tsien


Current State of Mutants

Today, many more derivatives have been created from GFP and DsRed (a red fluorescent protein)

Researchers have access to a range of colors, including green, yellow, orange, red, violet, blue, and cyan

An illustration of a San Diego beach scene drawn using 8 colors of FPs

Rainbow of FPs from the Tsien lab


Experimental Uses

We mentioned before that FPs can be used to track cellular processes

Researchers can simply attach an FP to some object of interest and then visually follow the object

Mice expressing GFP next to normal mice

GFP-expressing neurons

Protein Data Bank:

A Brief Overview

The Protein Data Bank (PDB) is a repository of 3D structural data of large biological molecules (e.g. proteins and nucleic acids)

This structural data can be downloaded and used to render a 3D image of the molecule of interest

3D rendering of GFP from PDB data
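To make "structural data" concrete: a PDB-format file stores one atom per ATOM line, with the x, y, z coordinates in fixed text columns. The sketch below reads those columns, which is roughly the information a viewer needs before it can draw anything. The helper names (pdb_download_url, parse_atoms, toy_atom_line) and the toy coordinates are made up for illustration, and the download URL pattern is an assumption about the current RCSB site rather than something stated in this workshop.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str       # atom name, e.g. "CA"
    residue: str    # three-letter residue name, e.g. "MET"
    chain: str      # chain identifier, e.g. "A"
    res_seq: int    # residue number within the chain
    x: float
    y: float
    z: float


def pdb_download_url(pdb_id: str) -> str:
    """Build a download URL (assumed pattern; check the RCSB site if it fails)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Read ATOM records using the fixed column positions of the PDB format."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip headers, HETATM records, and so on
        atoms.append(Atom(
            name=line[12:16].strip(),
            residue=line[17:20].strip(),
            chain=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
        ))
    return atoms


def toy_atom_line(serial, name, res, chain, res_seq, x, y, z) -> str:
    """Format made-up values into the ATOM columns (for this demo only)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


if __name__ == "__main__":
    # Toy records with invented coordinates, not taken from entry 4KW4.
    text = "\n".join([
        toy_atom_line(1, "N", "MET", "A", 1, 10.0, 20.0, 30.0),
        toy_atom_line(2, "CA", "MET", "A", 1, 11.5, 20.25, 29.0),
    ])
    for atom in parse_atoms(text):
        print(atom)
    print(pdb_download_url("4kw4"))
```

In practice, a library such as Biopython would handle the many special cases in real PDB files; the point here is only that the file is plain text with coordinates in it.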

Protein Data Bank:

Step 1: Querying the PDB

Open Mozilla Firefox and navigate to www.rcsb.org

The search box at the top of the page allows you to “Search by PDB ID, author, macromolecule, sequence, or ligands”

Search for the term Green Fluorescent Protein and hit “Go”

Scroll down and click on entry 4KW4: “Crystal Structure of Green Fluorescent Protein”

http://www.rcsb.org/

Protein Data Bank:

Step 2: Questions About Results

Who are the authors of the primary citation for 4KW4?

What organism is this protein from?

How long (in amino acids) is this protein?

What method was used to produce this entry’s data?

What is the resolution in Angstroms (Å)?
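The answers to questions like these are shown on the entry's web page, but they are also recorded in the header lines of the PDB-format file. As a rough sketch, the function below pulls the experimental method, resolution, and source organism out of header text. The function name summarize_pdb_header is hypothetical, and the sample header uses invented placeholder values (not the real values for 4KW4), so you still need to look up the answers yourself.

```python
import re


def summarize_pdb_header(pdb_text: str) -> dict:
    """Collect a few header fields from PDB-format text (first match wins)."""
    summary = {"method": None, "resolution_angstroms": None, "organism": None}
    for line in pdb_text.splitlines():
        if line.startswith("EXPDTA") and summary["method"] is None:
            # EXPDTA holds the experimental technique
            summary["method"] = line[6:].strip()
        elif line.startswith("REMARK   2") and summary["resolution_angstroms"] is None:
            # REMARK 2 holds the resolution, when one is reported
            match = re.search(r"RESOLUTION\.\s+(\d+(?:\.\d+)?)\s+ANGSTROMS", line)
            if match:
                summary["resolution_angstroms"] = float(match.group(1))
        elif line.startswith("SOURCE") and summary["organism"] is None:
            # SOURCE lines list the organism as a key: value pair ending in ';'
            match = re.search(r"ORGANISM_SCIENTIFIC:\s*([^;]+)", line)
            if match:
                summary["organism"] = match.group(1).strip()
    return summary


if __name__ == "__main__":
    # Invented placeholder header, NOT the real header of entry 4KW4.
    toy_header = (
        "EXPDTA    X-RAY DIFFRACTION\n"
        "REMARK   2 RESOLUTION.    9.99 ANGSTROMS.\n"
        "SOURCE   2 ORGANISM_SCIENTIFIC: EXAMPLE ORGANISM;\n"
    )
    print(summarize_pdb_header(toy_header))
```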

Protein Data Bank:

Step 3: Rendering 3D Structure

Return to the PDB homepage: www.rcsb.org

In the left-column panel, click “Visualize”

In the box that says “Enter a PDB ID”, enter 4KW4 and click “View Jmol”

You should see a 3D rendering of GFP

You can click and drag the 3D render to rotate it

Protein Data Bank:

Step 4: Display Customization

Under “Select Display Mode,” click “Custom View”

Cycle through the different Style options and choose your favorite

My personal favorite is the default, Cartoon

Cycle through the different Color options

You can also change the color(s) by right-clicking on the 3D render, going to Color, then Structures, then Cartoon (assuming you’re still in Cartoon style), and choosing a color

You can also go to Color, then Structures, then Cartoon, then By Scheme and choose one of those options

Protein Data Bank:

Step 5: Exporting 3D Image

Finish customizing the 3D image to your liking

Feel free to play with the other options in the menu that pops up when you right-click on the 3D image

If you want to revert to the original settings, just refresh the page and it will reload with the default settings

When you are ready to export the final image, just click the blue “Export 3D Image” button, specify a destination, and click “Save”

Enjoy your cool 3D image of GFP!

Multiple Sequence Alignment:

The FASTA Format

The FASTA format is a text-based format for representing DNA, RNA, or protein sequences

A sequence in the FASTA format begins with a single-line description (beginning with the ‘>’ character), followed by one or more lines of sequence data
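Because the format is this simple, a FASTA file can be read with a few lines of code. The sketch below is one minimal way to do it, assuming each description line is unique; the function name parse_fasta and the two short sequences are made up for illustration and are not the contents of protein_sequences.fasta.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each description line (without '>') to its full sequence."""
    records = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            current = line[1:].strip()
            records[current] = []
        elif current is not None:
            # sequence data may be wrapped over several lines
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


if __name__ == "__main__":
    # Invented toy records, only to show the layout of the format.
    toy_fasta = """>toy_protein_1 first example
MSKGEELF
TGVVPIL
>toy_protein_2 second example
MSKGAELF
"""
    for name, sequence in parse_fasta(toy_fasta).items():
        print(name, len(sequence), sequence)
```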


Sequence Alignment

A sequence alignment is a way of arranging biological sequences (DNA, RNA, or protein) to identify regions of similarity between the sequences

Gaps can be inserted between characters in the sequences so that identical or similar characters can be aligned in the same column

An example multiple sequence alignment
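To see how gaps get placed, here is a small sketch of the classic Needleman-Wunsch algorithm, which aligns two sequences from end to end. This is a deliberately simplified stand-in: it handles only a pair of sequences, and the scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions. Multiple sequence alignment tools build on ideas like this but use more elaborate scoring and align many sequences at once.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def pair_score(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("ACGT", "AGT")
    print(top)
    print(bottom)
    print("score:", total)
```

With the default scores, aligning ACGT with AGT places a single gap opposite the C, so the three shared letters line up in the same columns.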


GFP and its Derivatives

In the following activity, we will align the sequences of GFP and some of its derivative fluorescent proteins

These proteins’ sequences are provided in the file named protein_sequences.fasta

Using the results from the multiple sequence alignment, we will be able to construct a phylogenetic tree

This tree will provide us information about the pairwise “closeness” between the protein sequences


ClustalW2

ClustalW2 is a popular multiple sequence alignment tool

Download protein_sequences.fasta from:

http://ubic.ucsd.edu/gfp/

Go to the ClustalW2 website:

http://www.ebi.ac.uk/Tools/msa/clustalw2/

Under “STEP 1 – Enter your input sequences”, upload protein_sequences.fasta by clicking “Choose File”

Under “STEP 4 – Submit your job”, click “Submit”


ClustalW2 (Continued)

After you click Submit, ClustalW2 will redirect you to the results of the multiple sequence alignment

The IDs of the sequences are to the left of the alignment, and each row of the alignment corresponds to a single sequence (e.g. the first row of every chunk is “GFP(4KW4)”)

If the alignment doesn’t make sense to you, be sure to ask one of the UBIC officers any questions you have!
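One way to read an alignment is to compare two rows column by column and count how often the characters agree. The sketch below computes a percent identity for two aligned rows of equal length. Note the assumption: columns where either row has a gap ('-') are skipped, which is only one of several definitions in use. The function name and the toy rows are made up for illustration.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of gap-free columns in which two aligned rows agree."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for char_a, char_b in zip(row_a, row_b):
        if char_a == "-" or char_b == "-":
            continue  # assumption: gap columns are not counted
        compared += 1
        if char_a == char_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented toy rows, not taken from the workshop alignment.
    print(percent_identity("ACD-F", "ACEQF"))
```

In the toy rows, four columns are gap-free and three of them agree.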

Evolutionary Relationships:

Phylogenetic Tree

A phylogenetic tree is a branching diagram (or “tree”) that shows relationships of “closeness” between different biological species or other entities

Elements that are closer together on the tree have “closer” (more similar) sequences

In the ClustalW2 results page, click “Send to ClustalW2_Phylogeny”

On the resulting page, under “STEP 3 – Submit your job”, click “Submit”

Draw out the phylogenetic tree (questions will be asked about it on the Extra Credit assignment)
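The idea that "closer" sequences sit closer together on the tree can be sketched with UPGMA, a simple clustering method that repeatedly joins the two closest groups. This is an illustrative simplification, not necessarily the method the ClustalW2_Phylogeny page runs (tree-building tools often use other methods, such as neighbor-joining), and it returns only the branching order, not branch lengths. The function name and the distances in the demo are invented.

```python
from itertools import combinations


def upgma(names, distances):
    """Cluster names into a nested-tuple tree.

    distances maps frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    dist = dict(distances)
    while len(sizes) > 1:
        # pick the closest pair of current clusters
        a, b = min(combinations(sizes, 2), key=lambda pair: dist[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for other in sizes:
            # the new distance is the size-weighted average of the old ones
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


if __name__ == "__main__":
    # Invented distances between four hypothetical sequences.
    toy = {
        frozenset(("A", "B")): 0.1,
        frozenset(("C", "D")): 0.2,
        frozenset(("A", "C")): 0.6,
        frozenset(("A", "D")): 0.6,
        frozenset(("B", "C")): 0.6,
        frozenset(("B", "D")): 0.6,
    }
    print(upgma(["A", "B", "C", "D"], toy))
```

In the toy example, A and B are joined first because they have the smallest distance, then C and D, and finally the two pairs are joined at the root.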

Phylogenetic Trees:

Biological Importance

The information provided by phylogenetic trees is extremely valuable and is even applicable to medicine

In 1994, Richard Schmidt, an American physician, injected his ex-lover and former colleague, Janice Trahan, with blood from one of his HIV-positive patients, infecting her with HIV

HIV DNA was collected from the victim, from the putative patient source, and from thirty-two other unrelated, HIV-positive individuals

Scientists concluded that, of all the samples they tested, the viral DNA from the victim and from the patient matched almost exactly, despite HIV's ability to mutate very rapidly

Phylogenetic Tree from the HIV Court Case

GFP Workshop:

Summary

Congratulations on finishing the GFP Workshop!

Throughout the workshop, you learned the following:

GFP’s history and uses

How to use the PDB (and render 3D protein structures)

Multiple Sequence Alignment using ClustalW2

Phylogenetic Tree Construction from a Multiple Sequence Alignment using ClustalW2_Phylogeny

We hope you enjoyed the workshop, and we hope you have found an interest in the field of Bioinformatics!
