Data Science Challenges in Personal Program Analysis
![Page 1: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/1.jpg)
Data Science Challenges in Personal Program Analysis
Bas van Schaik
New York R Conference (April 2016)
![Page 2: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/2.jpg)
- Cloud service for personal program analysis
- Free for OSS projects
- Currently in private beta, release imminent
![Page 3: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/3.jpg)
Personal Program Analysis: why?
We are passionate about code.
We wish everyone would write better code.
We help people build better software better.
![Page 4: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/4.jpg)
Ehm… Program analysis?
Compiler
![Page 5: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/5.jpg)
What’s an ‘Alert’?
Short answer: a bug or a violation of good coding practice
Example: define the same key twice in a Python dict
E.g. in OpenStack Designate:
    self.target = objects.PoolTarget.from_dict({
        'type': 'powerdns',
        'options': [{
            'key': 'connection', 'value': 'memory://',
            'key': 'host', 'value': '127.0.0.1',
            'key': 'port', 'value': 53}],
    })
My guess of what was intended:
    self.target = objects.PoolTarget.from_dict({
        'type': 'powerdns',
        'options': [
            {'key': 'connection', 'value': 'memory://'},
            {'key': 'host', 'value': '127.0.0.1'},
            {'key': 'port', 'value': 53}],
    })
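
Why this is a bug: Python accepts a dict literal with repeated keys and silently keeps only the last value for each key, so the first snippet ends up with a single `port` option. The sketch below shows that behaviour and a minimal duplicate-key check built on the standard `ast` module. It is an illustration only, not the query used by the analysis service; `find_duplicate_keys` and `SOURCE` are names made up for this example.

```python
import ast

# The 'options' literal from the Designate example, kept as source text.
SOURCE = """[{
    'key': 'connection', 'value': 'memory://',
    'key': 'host', 'value': '127.0.0.1',
    'key': 'port', 'value': 53}]"""


def find_duplicate_keys(source):
    """Return (line, key) pairs for constant keys repeated within one dict literal."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Dict):
            seen = set()
            for key in node.keys:
                # key is None for `**mapping` unpacking; only constants are checked
                if isinstance(key, ast.Constant):
                    if key.value in seen:
                        duplicates.append((key.lineno, key.value))
                    else:
                        seen.add(key.value)
    return duplicates


# Evaluating the literal shows what the program actually gets at run time.
options = ast.literal_eval(SOURCE)
duplicates = find_duplicate_keys(SOURCE)

print(options)
print(duplicates)
```

The evaluated list holds one dict with two entries, while the check reports every repeated `'key'` and `'value'` along with its line number.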
![Page 6: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/6.jpg)
What’s an ‘Alert’?
Alerts are found by queries:
● The source code is our database
● Every query result is an alert.
Support for 10 different programming languages (and counting), with more than 1,000 queries and metrics in total.
![Page 7: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/7.jpg)
What does a query look like?
    from Method m
    where m.hasName("hashcode") and m.hasNoParameters()
    select m, "Should this method be called 'hashCode' rather than 'hashcode'?"
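
To make the "source code is our database" idea concrete, here is a rough Python analogue of the query above. It assumes a toy `Method` table extracted from Python source with the standard `ast` module, then filters it the way the `from` / `where` / `select` clauses do. The `Method` record, `extract_methods`, and the sample class are all hypothetical, and the real query engine works differently.

```python
import ast
from typing import NamedTuple


class Method(NamedTuple):
    name: str
    param_count: int  # parameters other than `self`
    line: int


SOURCE = """
class Point:
    def equals(self, other):
        return True
    def hashcode(self):
        return 42
"""


def extract_methods(source):
    """Build the 'database': one row per method defined directly in a class body."""
    rows = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    params = [a.arg for a in item.args.args if a.arg != "self"]
                    rows.append(Method(item.name, len(params), item.lineno))
    return rows


# from Method m where <condition> select m, <message>
alerts = [
    (m, "Should this method be called 'hashCode' rather than 'hashcode'?")
    for m in extract_methods(SOURCE)
    if m.name == "hashcode" and m.param_count == 0
]

for method, message in alerts:
    print(f"line {method.line}: {message}")
```

Every tuple in `alerts` corresponds to one query result, which is what the slides call an alert.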
![Page 8: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/8.jpg)
Making it interesting: project over time
[Figure: OpenStack Nova (Python); panels labelled net alerts, activity, composition, net LOC]
![Page 9: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/9.jpg)
Or: compare different projects
[Figure: alerts and LOC per project; projects shown: Cinder, Nova, Neutron, Horizon, Heat, Swift, Sahara, Glance, Designate, Keystone, Fuel, Ironic]
![Page 10: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/10.jpg)
Even more interesting: make it personal
[Figure: net alerts and net LOC contributed (all OpenStack modules); points labelled A, B, and X]
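
A minimal sketch of how per-contributor figures like these could be computed, assuming each commit has already been attributed to an author and annotated with lines added and removed and alerts introduced and fixed. The record layout, the author names, and the numbers are made up for illustration; the talk does not describe the actual pipeline.

```python
from collections import defaultdict

# Hypothetical per-commit records:
# (author, lines added, lines removed, alerts introduced, alerts fixed)
commits = [
    ("alice", 120, 20, 3, 0),
    ("bob", 10, 200, 0, 4),
    ("alice", 30, 50, 0, 2),
]


def net_contributions(records):
    """Sum net LOC and net alerts per author across all commits."""
    totals = defaultdict(lambda: [0, 0])
    for author, added, removed, introduced, fixed in records:
        totals[author][0] += added - removed      # net LOC
        totals[author][1] += introduced - fixed   # net alerts
    return {author: tuple(values) for author, values in totals.items()}


totals = net_contributions(commits)
for author, (net_loc, net_alerts) in sorted(totals.items()):
    print(f"{author}: net LOC {net_loc:+d}, net alerts {net_alerts:+d}")
```

A contributor with negative net LOC and negative net alerts, like `bob` here, is mostly removing code and fixing alerts along the way.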
![Page 11: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/11.jpg)
Data Science for PPA: finding fun facts
Who's doing what in OpenStack?
[Figure: axes labelled Total contributors and % contributors; legend: Trailblazer, Bug squasher, Refactorer, None, Major release]
![Page 12: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/12.jpg)
Data science for PPA: cleaning
PostgreSQL (net churn and net alerts): before cleaning
PostgreSQL: after cleaning
![Page 13: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/13.jpg)
Warning:
DEMO of beta software
![Page 14: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/14.jpg)
But… why make it personal?
Some developers are not so happy:
“Are you questioning my ability to write code?”
No. We're helping you to improve.
![Page 15: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/15.jpg)
But… why make it personal?
By making it personal, we make people care.
When people care, they improve.
When developers improve, the code improves.
![Page 16: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/16.jpg)
But… why make it personal?
When developers improve, the code improves.
● Automated code review on GitHub pull requests
● “On 12/11/2015 you introduced X, fancy fixing that?”
● “You recently fixed alert A in file B. Based on your expertise, you might also be interested in fixing alert X in file Y?”
● “Compared to developers like you, you rank 20 out of 100”
● “… and by fixing these 5 alerts, you'll be in the top 10!”
● Found a bug in your project? Write a query for it, share it!
![Page 17: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/17.jpg)
Not rocket science… Or is it?
![Page 18: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/18.jpg)
DEMO (continued)
![Page 19: Data Science Challenges in Personal Program Analysis](https://reader031.vdocuments.site/reader031/viewer/2022030308/58f046931a28ab324e8b4685/html5/thumbnails/19.jpg)
Interested in…
Early access to CodingStars?
Having your OSS project analysed?
Working for us in New York, San Francisco, Oxford (UK), or Copenhagen (Denmark)?
Talk to us! (in person, or [email protected])