GSC2.2 Classification GSC II Annual Meeting October 2001


Page 1: GSC2.2 Classification

GSC2.2 Classification

GSC II Annual Meeting

October 2001

Page 2: GSC2.2 Classification

Single Plate Classification

Decision tree classifier:
– Use ranks to handle plate-to-plate variation
– 5000+ objects in training set
– OC1 oblique decision tree (Murthy et al.)
– Build several decision trees and let them vote
– Classification categories: star / nonstar / defect
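The two ideas on this slide, rank-based features and voting among several trees, can be sketched as follows. This is only an illustration under stated assumptions: `to_ranks` and `ensemble_vote` are hypothetical names, the "trees" are stand-in callables rather than real OC1 trees, and the way ties are handled here is an arbitrary choice, not something the slides specify.

```python
from collections import Counter


def to_ranks(values):
    """Map raw feature values from one plate to fractional ranks in [0, 1].

    Ranks depend only on ordering, so a plate-wide offset or scale change
    leaves them unchanged. Equal values get distinct, adjacent ranks here
    (an arbitrary simplification).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    denom = max(len(values) - 1, 1)
    for position, index in enumerate(order):
        ranks[index] = position / denom
    return ranks


def ensemble_vote(trees, features):
    """Let each tree (any callable returning a label) vote on one object.

    Returns the most common label; on a tie, the label that was voted
    first wins (assumption, not from the slides).
    """
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained trees: each tests a linear
# combination of two rank features, loosely mimicking an oblique split.
trees = [
    lambda f: "star" if 0.6 * f[0] + 0.4 * f[1] > 0.5 else "nonstar",
    lambda f: "star" if 0.5 * f[0] + 0.5 * f[1] > 0.4 else "nonstar",
    lambda f: "star" if 0.3 * f[0] + 0.7 * f[1] > 0.6 else "nonstar",
]
label = ensemble_vote(trees, (0.9, 0.8))
```

Two plates whose raw values differ by a constant offset, such as `[10.0, 30.0, 20.0]` and `[110.0, 130.0, 120.0]`, yield identical ranks, which is the property that makes ranks useful against plate-to-plate variation.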

Page 3: GSC2.2 Classification

GSC2.2 Classification

Unlike astrometry and photometry, where one best value was selected per object (per bandpass), GSC2.2 classification can combine multiplate information to improve the final classifications and counter some known weaknesses.

Page 4: GSC2.2 Classification

MultiPlate Voting

For each object:
• Collect all single-plate measurements
– Even from plates not being exported, e.g. IV-N
• Override defect -> nonstar if N(obs) > 1
– Matched objects are likely to be real objects
• Eliminate 25 µm scan data if 15 µm data exist
– Classifier is poorly tuned for these scans
• Take a majority vote of the remaining measurements
– Voting among classifiers is known to improve results
• Break ties in favor of nonstars
– Compensates for a known bias
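The voting steps above can be sketched as a single function. This is a minimal sketch, assuming each single-plate measurement is reduced to a `(label, scan_um)` pair and that the steps are applied in the order listed on the slide; the function name and data layout are hypothetical, not the actual GSC-II pipeline.

```python
from collections import Counter


def multiplate_vote(observations):
    """Combine single-plate classifications of one object.

    observations: list of (label, scan_um) tuples, where label is
    "star", "nonstar" or "defect" and scan_um is 15 or 25.
    """
    if not observations:
        raise ValueError("need at least one observation")

    # Override defect -> nonstar when the object was matched on more
    # than one plate: a repeated detection is likely a real object.
    if len(observations) > 1:
        observations = [
            ("nonstar" if label == "defect" else label, scan_um)
            for label, scan_um in observations
        ]

    # Drop 25 um scan data whenever any 15 um data exist.
    if any(scan_um == 15 for _, scan_um in observations):
        observations = [obs for obs in observations if obs[1] == 15]

    # Majority vote of the remaining measurements.
    counts = Counter(label for label, _ in observations)
    top = max(counts.values())
    winners = [label for label, count in counts.items() if count == top]

    # Break ties in favor of nonstars.
    if len(winners) > 1:
        return "nonstar"
    return winners[0]
```

For example, `multiplate_vote([("star", 15), ("nonstar", 25), ("nonstar", 25)])` returns `"star"` because both 25 µm votes are discarded before counting, while `multiplate_vote([("defect", 15), ("star", 15)])` returns `"nonstar"` through the override followed by the tie-break.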

Page 5: GSC2.2 Classification

Auxiliary Information: the Source Status Flag

• GSC2.2 provides a wealth of additional information about each object via the source status flag.

• Much of this information is pertinent to the quality of the final classification.

• Informed users can further optimize their results (eg, guide star selection) with this auxiliary data.

Page 6: GSC2.2 Classification

Status Flag Details: 0987654321

10-digit decimal mask with relevant info. Columns:
0: blend status
9: incomplete processing
8: classification voters
7: classification unanimity
654: photometric details (V,J,F)
3: centroider details
21: number of plate observations
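Splitting the mask into its columns might look like the sketch below. It assumes the flag is available as a non-negative integer and that the column labels 0987654321 read left to right, so column 0 is the most significant digit and columns 2 and 1 are the last two. The function and key names are hypothetical, and the meaning of individual digit values is not given in these slides, so the digits are returned uninterpreted.

```python
def decode_status_flag(flag):
    """Split a 10-digit decimal status flag into its labeled columns."""
    flag = int(flag)
    if flag < 0 or flag > 9_999_999_999:
        raise ValueError("status flag must have at most 10 decimal digits")
    digits = f"{flag:010d}"  # zero-pad so every column is present
    return {
        "blend_status": int(digits[0]),              # column 0
        "incomplete_processing": int(digits[1]),     # column 9
        "classification_voters": int(digits[2]),     # column 8
        "classification_unanimity": int(digits[3]),  # column 7
        "photometric_details": digits[4:7],          # columns 6, 5, 4
        "centroider_details": int(digits[7]),        # column 3
        "n_plate_observations": int(digits[8:10]),   # columns 2 and 1
    }


fields = decode_status_flag(1203)
```

Here `fields["n_plate_observations"]` is 3 and `fields["centroider_details"]` is 2; every column to the left of the photometric block is 0 because the value is zero-padded to ten digits.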

Page 7: GSC2.2 Classification

Classification and the Status Flag

0: blend status
– Poorly tuned for blends => lower confidence
9: incomplete processing
– No features computed => lower confidence
8: classification voters
– Multiple voters => higher confidence
– 25 µm voters => lower confidence

Page 8: GSC2.2 Classification

Classification and the Status Flag

7: classification unanimity
– Unanimous vote => higher confidence
654: photometric details (V,F,J)
3: centroider details
21: number of plate observations
– More voters => higher confidence

Page 9: GSC2.2 Classification

Bright Objects

• Tycho stars are included in the GSC2.2
– Classification was set to star for these objects
– Status flag = 9999999900 for Tycho stars
• GSC1 data were omitted from the GSC2.2
– Classifications were excluded from voting
– GSC1 classifier superior for m<14
• Include GSC1 classification in next export

Page 10: GSC2.2 Classification

Evaluating Performance: Not a simple problem

• What to measure?
– Correctness; completeness; contamination
• Magnitude and latitude variations
• What to compare against?
– GSCII was constructed because there is nothing comparable to it!
– Nonstar <> galaxy
– Automatically classified samples are less reliable
– Visually classified samples are few and small
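The three quantities named under "What to measure?" can be computed from a labeled comparison sample roughly as follows. The definitions used here are the conventional ones and are an assumption, since the slides do not spell them out: correctness is the overall fraction classified correctly, completeness is the fraction of true members of a class that were recovered, and contamination is the fraction of objects assigned to a class that do not belong to it. The function name and sample labels are illustrative only.

```python
def classification_metrics(true_labels, predicted_labels, target="star"):
    """Correctness, completeness and contamination for one target class."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need two non-empty label lists of equal length")
    pairs = list(zip(true_labels, predicted_labels))

    correct = sum(t == p for t, p in pairs)
    truly_target = sum(t == target for t, _ in pairs)
    called_target = sum(p == target for _, p in pairs)
    hits = sum(t == target and p == target for t, p in pairs)

    nan = float("nan")
    return {
        "correctness": correct / len(pairs),
        "completeness": hits / truly_target if truly_target else nan,
        "contamination": (called_target - hits) / called_target
        if called_target else nan,
    }


# Made-up toy sample, for illustration only.
truth = ["star", "star", "star", "nonstar", "nonstar"]
predicted = ["star", "star", "nonstar", "star", "nonstar"]
metrics = classification_metrics(truth, predicted, target="star")
```

In practice these would be evaluated in bins of magnitude and latitude, since the slide notes that performance varies with both.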

Page 11: GSC2.2 Classification

NPM/SPM Stars vs magnitude & latitude

Page 12: GSC2.2 Classification

NPM/SPM Galaxies vs magnitude & latitude

Page 13: GSC2.2 Classification

SDSS Stars and Galaxies vs magnitude

Page 14: GSC2.2 Classification

Accuracy vs the real questions

• How complete is my sample of nonstars?

• How pure is my sample of stars?

• What is the probability that the GSC2.2 classification of this object is correct?

The answers depend on your sample, as well as on the properties of the catalog.

A single quoted accuracy does not suffice.

Page 15: GSC2.2 Classification

Accuracy vs the real questions

P(Ts|S) = [P(S|Ts)*P(Ts)] / P(S)

This formulation is:
– Responsive to magnitude and latitude variations
– Adaptable to a priori effects of sampling
– Adaptable to your favorite galaxy model
– Computable (we think! - in progress)
– Able to answer the real questions
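A small numerical sketch of the Bayes formula above, assuming Ts means "the object is truly a star" and S means "the object is classified as a star" (the slides do not define the symbols, so this reading is an assumption). P(S) is expanded over the two true classes, and the function name and all rates are made-up illustrative numbers, not measured GSC2.2 performance.

```python
def prob_true_star_given_classified_star(p_s_given_ts, p_s_given_tn, prior_ts):
    """P(Ts|S) = P(S|Ts) * P(Ts) / P(S).

    p_s_given_ts: P(S|Ts), chance a true star is classified as a star
    p_s_given_tn: P(S|Tn), chance a true nonstar is classified as a star
    prior_ts:     P(Ts), fraction of true stars in the sample
    """
    # Total probability of a "star" classification over both true classes.
    p_s = p_s_given_ts * prior_ts + p_s_given_tn * (1.0 - prior_ts)
    if p_s == 0.0:
        raise ValueError("P(S) is zero; the posterior is undefined")
    return p_s_given_ts * prior_ts / p_s


# Same hypothetical classifier rates, two different samples:
balanced = prob_true_star_given_classified_star(0.95, 0.10, prior_ts=0.5)
star_rich = prob_true_star_given_classified_star(0.95, 0.10, prior_ts=0.9)
```

With identical classifier rates, the star-rich sample gives a higher probability that a "star" classification is correct than the balanced sample. The prior P(Ts) is where magnitude, latitude, and a galaxy model would enter, which is why a single quoted accuracy does not suffice.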