kuba tatarkiewicz phd, r&d global director · © © 2019 horiba, ltd. all rights reserved.2019...
TRANSCRIPT
-
© 2019 HORIBA, Ltd. All rights reserved.© 2019 HORIBA, Ltd. All rights reserved. 1
How to present and compare dataobtained by particle tracking analysis
and other related methods
Kuba Tatarkiewicz PhD, R&D Global Director
3-6-2019
-
© 2019 HORIBA, Ltd. All rights reserved. 2
• Measuring sizes of individual particleshydrodynamic diameter dh > d0 due to diluent drag
• Counting tracked particles in a given volumeinvestigated volume depends on size and refractive index
• Testing small volume of a sample• statistical process of estimating particle size distribution• volume tested about 100,000x smaller than sample volume
Particle tracking analysis
-
© 2019 HORIBA, Ltd. All rights reserved. 3
• Definition of CN: counts per volume [part/mL]other defs: mass CM [mg/mL] or volume CV [μL/mL]
• Typically CN is given for a range of sizes e.g. between 100 nm and 1 μm diameters
• Hence use of plots with size bins• bin widths not necessarily equal• some software do not support unequal bins (notably Excel)
Concentration measurements
-
© 2019 HORIBA, Ltd. All rights reserved. 4
• Error in counting proportional to this applies to total counts and individual bin counts
• Number of bins depends on expected errorsN=10,000 and 100 bins, average error 10 counts/bin or 10%
• To obtain “nice”, smooth distributions:• decrease number of bins and use wider bins• use narrow bins to calculate parameters of a distribution
Statistical considerations
𝑁𝑁
-
© 2019 HORIBA, Ltd. All rights reserved. 5
Example plots of size distributions
Counts Density of PSDlogarithmic bins
Cumulative
0
200
400
600
800
1,000
1,200
1,400
1,600
0 400 800 1,200 1,600 2,000
Cou
nts
per b
in [N
]
Diameter [nm]
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
10 100 1,000
Cum
ulat
ive
coun
ts
Diameter [nm]
D50=217 nm
-
© 2019 HORIBA, Ltd. All rights reserved. 6
• Definition of a bin:• d ∈ [bi, bi+1[ where i=1,…,N• equivalent definition: bi ≤ d < bi+1 (non-overlapping bins)
• typically b1 = 0 and bN = max size to be measured• Strange bins definition encountered:
Binning
-
© 2019 HORIBA, Ltd. All rights reserved. 7
• Concentration is just a rational number, also per bin• Histogram uses area to represent a distribution• With unequal bins natural measurable is:
density of particle size distribution (PSD)units: number of particles per bin width and
per investigated volume [part/nm/mL]total concentration ≡ area of a histogram
What is really plotted
-
© 2019 HORIBA, Ltd. All rights reserved. 8
• When plotting concentration vs. size,DO NOT connect points*
• Total concentration is a sum of numbers, not an integral• When plotting density of PSD, one can connect
middle of bins ☛ linear approximation• Total concentration CN is then
an area under the curve
Dimensional analysis
*There’s no data between points – by definition of a bin
-
© 2019 HORIBA, Ltd. All rights reserved. 9
• Create a list of measured sizestypically diameters of nanoparticles are given in [nm]
• Bin those sizes in specific binningtypically logarithmic binning
(bi+2 – bi+1 ) / (bi+1 – bi ) = const
• Calculate density of PSD [part/nm/mL]; V0 needed• Plot as a histogram of density of PSD• Calculate parameters of a distribution (AV, SD, D50 ...)
Practical procedure
-
© 2019 HORIBA, Ltd. All rights reserved. 10
Example of real data in logarithmic binning
Concentration from 50 nm to 700 nm = area of histogram of density of PSD
Den
sity
of P
SD
-
© 2019 HORIBA, Ltd. All rights reserved. 11
• Most standards of size are mono-dispersethere are no standards for concentration…
• Typically real life samples are poly-dispersee.g. multiple sizes or continuous distributions
• Natural samples like sea water or blood:• highly poly-disperse with dominating small particles
• Junge distribution N(d) ∼ d-x where x=3.5÷4
Real data vs. standards
-
© 2019 HORIBA, Ltd. All rights reserved. 12
• Fitting unknown distribution of sizes • Assumed binomial(s) distribution• Height (or area) of different peaks
is NOT conserved during fitting(not invariant of any fitting procedure)
• Looks great but lacks numerical accuracy• Artificial peaks can be created
Fitted data problem (NanoSight FTLA)
cf. van der Pol et al. J Thrombosis and Haemostasis, 12, 1182 (2014)
-
© 2019 HORIBA, Ltd. All rights reserved.
• Concentration
• Average size
• Standard deviation
• D50
13
Parametric description of distributions
Dk =1
𝑁𝑁𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡�𝑖𝑖=1
𝑘𝑘
𝑛𝑛𝑖𝑖 definition Dk = 0.5☛ k☛ bk equal D50
𝑆𝑆𝑆𝑆 =1
𝑁𝑁𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡�𝑖𝑖=1
𝑁𝑁
𝑛𝑛𝑖𝑖 ∗ 𝑑𝑑𝑡𝑡𝑎𝑎𝑎𝑎𝑎𝑎𝑡𝑡𝑎𝑎𝑎𝑎 − 𝑏𝑏𝑖𝑖 +𝐵𝐵𝑖𝑖2
2
Bi = (bi+1 - bi) 𝑁𝑁𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 = �𝑖𝑖=1
𝑁𝑁𝑛𝑛𝑖𝑖𝐵𝐵𝑖𝑖𝐵𝐵𝑖𝑖 = �
𝑖𝑖=1
𝑁𝑁
𝑛𝑛𝑖𝑖
𝑑𝑑𝑡𝑡𝑎𝑎𝑎𝑎𝑎𝑎𝑡𝑡𝑎𝑎𝑎𝑎 =1
𝑁𝑁𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡�𝑖𝑖=1
𝑁𝑁
𝑛𝑛𝑖𝑖 ∗ (𝑏𝑏𝑖𝑖 +𝐵𝐵𝑖𝑖2
)
-
© 2019 HORIBA, Ltd. All rights reserved. 14
D50 & mode depend on binning!
most accurate
which one is mode?which one is mode?
-
© 2019 HORIBA, Ltd. All rights reserved. 15
• Parametric description using AV and SD
is not accurate or unique!
Anscombe’s quartet – same AV and SD, various shapes
• Even higher moments do not give enough info
• Basic experimental question:How similar are two measured distributions?
Lies, damned lies, and statistics
-
© 2019 HORIBA, Ltd. All rights reserved. 16
Kolmogorov-Smirnov statistics
max
Non-parametric test:Kolmogorov-Smirnov statistics
dav = 256 nm, SD = 145 nm, CV = 0.57dav = 163 nm, SD = 68 nm, CV = 0.42
Comparing parameters:
Sheet1
DA,BalphaDA,B,αReject?
0.23350.050.0338yes
-
© 2019 HORIBA, Ltd. All rights reserved. 17
Anderson-Darling statistics
Area between two cumulatives is a good measure of distance between two or more distributions with unknown shape.
-
© 2019 HORIBA, Ltd. All rights reserved. 18
• Normalize area between cumulativesextreme case: a) particles at 10 nm, b) particles at 1000 nm
• Distance is a number from the range [0,1]same distributions have distance 0extreme case – distance 1
• If areas are calculated between distributions in same normalization, it can be a good measure
Distance between distributions
d
cum
ulat
ive
-
© 2019 HORIBA, Ltd. All rights reserved.
DankeБольшое спасибо
Grazie
감사합니다
Obrigado谢谢
ขอบคณุครบั
ありがとうございました
धन्यवाद
Cảm ơn
Tack ska ni ha
Thank you
Merci
Gracias
Dziękuję
Σας ευχαριστούμε
நன்ற
19
How to present and compare data�obtained by particle tracking analysis�and other related methodsParticle tracking analysisConcentration measurementsStatistical considerationsExample plots of size distributionsBinningWhat is really plottedDimensional analysisPractical procedureExample of real data in logarithmic binningReal data vs. standardsFitted data problem (NanoSight FTLA)Parametric description of distributionsD50 & mode depend on binning!Lies, damned lies, and statisticsKolmogorov-Smirnov statisticsAnderson-Darling statisticsDistance between distributionsSlide Number 19