managed faster
TRANSCRIPT
8/6/2019 Managed Faster
http://slidepdf.com/reader/full/managed-faster 1/52
Managed Or Unmanaged?
Is Managed Code Slower Than Unmanaged Code?
Ask anyone the question above and they will say that managed code is slower than unmanaged code. Are they right? No, they are not. The problem is that when most people think of .NET they think of other frameworks with a runtime, like Java or Visual Basic; or they may even think about interpreters. They do not think about applications, or what they do; they do not think about limiting factors like network or disk access; in short, they do not think.
.NET is not like those frameworks. It has been well thought out and Microsoft has put a lot of effort into making it work well. In this article I will present some code that performs a computationally intensive operation and I will compile it as both managed C++ and unmanaged C++. Then I will measure how each of these libraries performs. As you will see, .NET is not automatically much slower than unmanaged code; indeed, in some cases it is much faster.
Fast Fourier Transform
Data that varies over time (for example, music) will be the combination of various frequency components. A Fourier Transform will convert the time-varying data into its frequency components. I came across Fourier Transforms because I spent six years as a research scientist performing spectroscopy experiments. One experiment I performed produced an interferogram, that is, the sample under investigation produces a response over time when an interference pattern generated from white light is shone on it. The interferogram is the response over time, but the information required was the response of the sample to the frequency of the radiation. So, a Fourier Transform was taken of the time-varying data to yield the frequency-varying response. I was performing these measurements in 1993, and at that time a PC running DOS was just about fast enough to allow me to take the time-varying data, perform the Fourier Transform and display the frequency-based results, all in real time, that is, as the measurements were taken.
The limiting factor in this program was the Fourier Transform routine, because it involved so many calculations. A Fourier Transform on N points will involve 2N² computations, so if you have a thousand data points you will perform two million computations. There is a lot of theory about Fourier Transforms, and I will not go into details here, but the theory has led to a routine called the Fast Fourier Transform (FFT) that, through careful data manipulation, will generate a Fourier Transform by performing just 2N log₂ N operations. For example, if you have a thousand data points then using the FFT you will perform about 20,000 computations. The FFT routine still involves performing some trigonometric calculations, and it involves many numeric and array operations. Although the FFT is optimized compared to the Fourier Transform, it is still a computationally intensive calculation and is a good routine to exercise the performance of managed and unmanaged code.
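
To make the difference between the two operation counts concrete, here is a minimal C++ sketch that places a direct Fourier Transform next to a textbook recursive radix-2 FFT. It is not Takuya Ooura's routine and not the code measured in this article; the names naive_dft, fft_radix2, dft_operations and fft_operations are illustrative assumptions, and the two counting functions simply evaluate the 2N² and 2N log₂ N estimates quoted above.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Direct evaluation: every output bin sums over every input sample,
// so the work grows in proportion to N * N.
std::vector<Complex> naive_dft(const std::vector<Complex>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Textbook recursive radix-2 FFT (illustrative, not Ooura's algorithm).
// The input length must be a power of two. Each level of recursion halves
// the problem, giving log2(N) levels of work proportional to N.
std::vector<Complex> fft_radix2(const std::vector<Complex>& x) {
    const std::size_t n = x.size();
    assert((n & (n - 1)) == 0);  // power-of-two lengths only
    if (n <= 1) return x;

    std::vector<Complex> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    const std::vector<Complex> e = fft_radix2(even);
    const std::vector<Complex> o = fft_radix2(odd);

    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const double angle =
            -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        const Complex twiddle = Complex(std::cos(angle), std::sin(angle)) * o[k];
        out[k] = e[k] + twiddle;
        out[k + n / 2] = e[k] - twiddle;
    }
    return out;
}

// The operation-count estimates quoted in the text.
double dft_operations(double n) { return 2.0 * n * n; }
double fft_operations(double n) { return 2.0 * n * std::log2(n); }
```

For N = 1000, dft_operations gives exactly two million and fft_operations gives a little under 20,000, which matches the rounded figures in the text. Both transforms should agree to within floating-point error on any power-of-two input.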
The Code
There are many algorithms available, and if you perform a search for FFT you will get many thousands of hits. I chose to use the Real Discrete Fourier Transform by Takuya Ooura mainly because the code was clear and easy to change. I made four copies of Takuya's code: one for unmanaged calculations, and one each for managed calculations using Managed C++, C++/CLI and C#.
The only change I made to the unmanaged library was to change the name of the function to fourier and add __declspec(dllexport) so that this function could be exported from the library. The Managed C++ code involved a few more changes, but these were relatively minor:
• method parameters were changed to take managed arrays, and the routines used the Array::Length property rather than requiring a separate length parameter
• trigonometric routines were used from the Math class
• the routine was exported as a static member of a public class
The managed C++ code was then converted to C#, which involved fairly minor changes (mainly in the syntax to declare arrays). Finally, the managed C++ code was converted to C++/CLI; this involved a little bit more work, again, mainly because of the way that arrays are declared. The C++/CLI code is compiled as /clr:safe because it does not use any unverifiable constructs.
[...] will be a single spike.
The test harness process takes two parameters: the first determines the number of data points that will be tested, and the second is the number of repeats that will be performed. FFT routines work better if the number of data points is a power of two, so the number you give for the first parameter is used as the power (and must be less than 28). Each routine is performed a single time without timing so that initialization can be performed: the library is loaded and any JIT compilation is performed. This is important because I am interested in the time taken to perform the calculation. After initialization the calculation is performed within a loop and each timing is stored for later analysis. From these timings the average time is calculated and the standard error is calculated from the standard deviation. The standard error gives a measure of the spread of values that were taken.
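
The measurement procedure described above might be sketched in C++ as follows. This is an illustration only, not the harness supplied with the article: the names points_from_power, time_repeats, mean, standard_deviation and standard_error are hypothetical, the clock is std::chrono::steady_clock, and the sample (n - 1) form of the standard deviation is an assumption because the text does not say which form was used.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts the first harness parameter (a power of two) into a point count.
std::size_t points_from_power(unsigned power) {
    assert(power < 28);  // limit stated in the article
    return static_cast<std::size_t>(1) << power;
}

// Calls `work` once without timing (so library loading and any JIT
// compilation are excluded), then `repeats` more times, storing each
// duration in milliseconds.
template <typename Work>
std::vector<double> time_repeats(Work work, std::size_t repeats) {
    work();  // untimed warm-up run
    std::vector<double> timings_ms;
    timings_ms.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        timings_ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return timings_ms;
}

double mean(const std::vector<double>& values) {
    assert(!values.empty());
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / static_cast<double>(values.size());
}

// Sample standard deviation (n - 1 denominator); an assumption, see above.
double standard_deviation(const std::vector<double>& values) {
    assert(values.size() > 1);
    const double m = mean(values);
    double squares = 0.0;
    for (double v : values) squares += (v - m) * (v - m);
    return std::sqrt(squares / static_cast<double>(values.size() - 1));
}

// Standard error of the mean: standard deviation divided by sqrt(n).
double standard_error(const std::vector<double>& values) {
    return standard_deviation(values) /
           std::sqrt(static_cast<double>(values.size()));
}
```

A caller would pass the calculation as a lambda, for example time_repeats([&] { /* run the FFT here */ }, 100), and then report mean and standard_error of the returned timings.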
Occasionally a rogue time will occur (perhaps due to the scheduling of a higher priority thread in another process) and these will have an effect on the mean. For a Normal Distribution the majority of values should be within one standard deviation of the mean. So I treat any value outside of this range as a rogue value. Of course, the mean and standard deviation are calculated using the rogue value(s), but their effect will be minimized if large datasets are used. So once the code has calculated the mean and standard deviation it goes through the dataset and removes values that are outside of the acceptable range, and then the mean and standard error are calculated on this new dataset.
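
The rogue-value step can be sketched as a two-pass calculation. Again this is a hedged illustration rather than the article's own code: Summary, summarize and remove_rogues are hypothetical names, and the sample (n - 1) standard deviation is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Statistics reported for one set of timings.
struct Summary {
    double mean;
    double standard_deviation;
    double standard_error;
    std::size_t count;
};

// Mean, sample standard deviation and standard error of the mean.
Summary summarize(const std::vector<double>& values) {
    assert(values.size() > 1);
    const double n = static_cast<double>(values.size());
    double sum = 0.0;
    for (double v : values) sum += v;
    const double mean = sum / n;
    double squares = 0.0;
    for (double v : values) squares += (v - mean) * (v - mean);
    const double sd = std::sqrt(squares / (n - 1.0));
    return {mean, sd, sd / std::sqrt(n), values.size()};
}

// First pass: statistics over all values, rogue values included.
// Second pass: keep only values within one standard deviation of that mean.
std::vector<double> remove_rogues(const std::vector<double>& values) {
    const Summary first = summarize(values);
    std::vector<double> kept;
    for (double v : values) {
        if (std::fabs(v - first.mean) <= first.standard_deviation) {
            kept.push_back(v);
        }
    }
    return kept;
}
```

Calling summarize(remove_rogues(timings)) then gives the mean and standard error of the cleaned dataset. Note that for normally distributed data a window of one standard deviation also discards roughly a third of the ordinary values, so the cleaned dataset is noticeably smaller than the original.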
The C++ compiler will allow you to optimize code for space and for speed, so I have written a makefile that allows you to compile the libraries for both optimizations and for no optimization. The C# compiler also provides an optimization switch, but this switch does not distinguish between optimization for speed or size, so I just compiled the optimized library once. The results given below are for all of these options. There is a batch file that will call nmake for each option and store the results in a separate folder.
The managed code uses a private assembly and so it is fully trusted. In this case many of the .NET security checks have been optimized away.
Results
I performed two sets of tests on two machines with .NET 2.0. One machine had XPSP2 and had a single processor, an 850 MHz Pentium III, with 512 MB of RAM. The other machine had build 5321 of Vista and had a single processor, a 2 GHz Mobile Pentium 4, with 1 GB of RAM. In each case I calculated the average of 100 separate FFT calculations on 2¹⁷ (131072) data values. From these values I calculated the standard error from the standard deviation. The results are shown in ms.
The results for the Pentium III machine are:
Unmanaged
Managed C++
C++/CLI
C# Managed
The results for the Mobile Pentium 4 are:
Unmanaged
Managed C++
C++/CLI
C# Managed
As you can see the [...]
[...]ing this rough analysis on the values for managed code shows that the optimized code is barely faster than the non-optimized code. This shows that for managed code, the optimization performed by the compiler and linker has a relatively small effect on the final executed code; bear this in mind when you read my conclusions derived from these results.
Interestingly, in these tests, there are few differences between the times taken for managed code optimized for space and for speed, and on the Vista machine code optimized for space runs quicker than code optimized for speed.
Note that the measurements of a particular optimization setting were taken at about the same time, so the relative values between the different [...]
[...] with speed optimized code), so the state of the machine is more likely to change between each process run than during the process run. Thus it is less accurate to compare results for different optimizations.
The results for C# code showed that there is little difference between C# and managed C++ in terms of performance. Indeed, the optimized C# library was actually slightly faster than the optimized managed C++ libraries.
Now compare the managed results with the [...]
[...]ed for speed that the unmanaged code is faster than managed code. The difference between unmanaged code and C# code is just 2% for the Pentium III/XPSP2 machine. There is also a 2% difference for the Pentium 4/Vista machine, but here the C# code is quicker.
Conclusions
Think of .NET in these terms. The .NET compiler (managed C++ in this case, but the same can be said for the other compilers) is essentially the equivalent of the parsing engine in the unmanaged C++ compiler. That is, the compiler will generate tables of the types and methods and perform some optimizations based on high level aspects like how loops and branches are handled. Think of the .NET JIT compiler as the back end of the unmanaged compiler: this is the part that really knows about generating code because it has to generate the low level x86 code that will be executed. The combination of a .NET compiler and the JIT compiler is an equivalent entity to the unmanaged C++ compiler; the only difference is that it is split into two components, meaning that the compilation is split over time. In fact, since the JIT compilation occurs at the time of execution, the JIT compiler can take advantage of 'local knowledge' of the machine that will execute the code, and the state of that machine at that particular time, to optimize the code to a degree that is not possible with the unmanaged C++ compiler run on the developer's machine.
The results show that the optimization switches in managed C++ and C# have relatively small effects, and that there is only a 2% difference between managed and unmanaged code. Significantly, C# code is as good as, or better than, managed C++ or C++/CLI, which means that your choice to use a managed version of C++ should be based on the language features rather than a perceived idea that C++ will produce 'more optimized code'.
There is nothing in .NET that means that it should automatically be much slower than native code; indeed, as these results have shown, there are cases when managed code is quicker than unmanaged code. Anyone who tells you that .NET should be slower has not thought through the issues.
Downloads
The code for these tests is supplied as C++/CLI code and so it will only compile for .NET 2.0.