Linux Kernel Development: Getting Started Copyright under GPLv2 Guan Xuetao <[email protected]> May, 2012 http://mprc.pku.edu.cn/~guanxuetao/linux/

Upload: winfred-richardson

Post on 25-Dec-2015


TRANSCRIPT

  • Slide 1
  • Linux Kernel Development: Getting Started Copyright under GPLv2 Guan Xuetao May, 2012 http://mprc.pku.edu.cn/~guanxuetao/linux/
  • Slide 2
  • Abstract/Goal
      • Linux development is fast-paced and [as they say in Oregon] things are different here.
      • This tutorial introduces some of the Linux culture and how to succeed when working with the Linux development community.
      • The kernel's development process
          • Differs greatly from proprietary development methods
          • May come across as strange and intimidating to new developers
          • But there are good reasons and solid experience behind it
  • Slide 3
  • Number crunching
      • > 15 million: SLOC in Jan. 2012
      • 38083: files in v3.3
      • ~1000: contributors for each release
      • 190: subtrees in linux-next
      • > 100: patches applied per day
      • ~80: days between new major kernel releases
      • 1: Linus Torvalds
      • One of the largest and most active free software projects in existence
  • Slide 4
  • Topics Open source development style, values, culture Linux rapid development cycle How to get your change into the linux kernel Communications methods Some best known practices Getting involved
  • Slide 5
  • Open source development style, values, culture
  • Slide 6
  • Development Style, Values, and Culture Learning curve, things are different Meritocracy good ideas & good code are rewarded Chance to work on a real OS any parts of it that interest you Massive amounts of open communication via email, IRC, etc. (i.e., not private)
  • Slide 7
  • Linux Culture (1)
      • Work in open, not behind closed doors (in smoke-filled rooms)
      • Community allegiance is very high
          • Do what is right for Linux
      • Meritocracy: good ideas and good code are rewarded
      • Often driven by ideals and pragmatism, bottom-up development
          • Not driven by marketing requirements
      • Don't just take, give back too:
          • Modifications are & remain GPL (if distributed)
          • Payment in kind, self-interest
          • Improve software quality, features used/understood more
  • Slide 8
  • Linux Culture (2) Committed to following and using standards (e.g., POSIX, IETF) Committed to compatibility with other system software Informal design/development: Not much external high-level project planning or design docs (maybe some internally at companies); can appear to be chaotic New ideas best presented as code, not specifications or requirements RERO: Release Early, Release Often -- for comments, help, testing, community acceptance Possible downsides: flames, embarrassment
  • Slide 9
  • Linux Culture (3) Development community is highly technical Motivated and committed, but since many are volunteers, treat them with respect and ask/influence them, don't tell Continuous code review (including security) Continuous improvement Have fun!! :) Follow the culture
  • Slide 10
  • Linux Development Values Scratch your own itch Weekenders -> big business Code, not talk Pragmatism, not theory Thick skin Code producer makes [most] decisions Pride, principles, ethics, honesty Performance Hardware & software vendor neutral Technical merit, not politics, who, or money Maintainability & aesthetics: clean implementation, not ugly hacks (coding style) Peer review of patches (technical & style) Contributions earn respect
  • Slide 11
  • Some Things to Avoid Patents, binary modules, NDA Proprietary benchmarks Huge patch files Adding more IOCTLs Marketing Design documents Mention of accomplishments outside of the open source world No patch rationale How do I intercept a system call (or replace a syscall table entry)? Making demands instead of requests This {driver / feature} must be merged, it's important to our company. Date or release version requirements
  • Slide 12
  • Linux rapid development cycle linux/Documentation/development-process/2.Process
  • Slide 13
  • How the development process works
      • A loosely time-based release process
      • A new major kernel release happens every two or three months
      • A rolling development model

        Release  Date           Patches  Files changed  Insertions  Deletions
        v3.0     Jul. 21, 2011
        v3.1     Oct. 24, 2011     9380           9181      726251     602017
        v3.2     Jan. 4, 2012     12695          12608     1645447    1417264
        v3.3     Mar. 18, 2012    11416          10698      598640     431219

      • From v3.0 to v3.3: 241 days, 139 patches per day
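
The per-day figure on this slide can be re-derived from the release table. The following is a minimal sketch, assuming the per-release patch counts (9380, 12695, 11416) and release dates as read from the slide; the variable names are illustrative only.

```python
from datetime import date

# Release dates and patch counts as read from the slide's table.
releases = {
    "v3.1": (date(2011, 10, 24), 9380),
    "v3.2": (date(2012, 1, 4), 12695),
    "v3.3": (date(2012, 3, 18), 11416),
}
v3_0_date = date(2011, 7, 21)

# Days elapsed between the v3.0 and v3.3 releases.
days = (releases["v3.3"][0] - v3_0_date).days

# Patches merged across the three development cycles.
total_patches = sum(count for _, count in releases.values())

patches_per_day = total_patches / days
print(days, total_patches, round(patches_per_day))
```

With these inputs the script reproduces the slide's 241 days and roughly 139 patches per day.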
  • Slide 14
  • Typical development cycle
      • Merge window
          • The major changes or new features are pulled
          • Deemed to be sufficiently stable (already accepted by the development community)
          • At a rate approaching 1,000 per day
          • Lasts for approximately two weeks
      • The first -rc kernel is released and the merge window closes
      • New -rc kernels are released about once a week
          • Over the next six to ten weeks
          • Only patches which fix problems should be submitted
          • The patch rate will slow over time
      • The final 3.x release
          • Normally follows somewhere between -rc6 and -rc9
      • The whole process starts over again
  • Slide 15
  • The v3.3 development cycle (all dates in 2012)

        Release   Date      Patches  Note
        v3.2      Jan. 4             Stable release
        v3.3-rc1  Jan. 19      9460  Merge window closed
        v3.3-rc2  Jan. 31       515
        v3.3-rc3  Feb. 8        288
        v3.3-rc4  Feb. 18       370
        v3.3-rc5  Feb. 25       198
        v3.3-rc6  Mar. 3        215
        v3.3-rc7  Mar. 10       241
        v3.3      Mar. 18       129  Stable release
        Sum       74 days     11416

      • Bug convergence
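
To make the "bug convergence" pattern concrete, the per-tag patch counts can be summed and expressed as a share of the whole cycle. This is only a sketch over the numbers shown on the slide; the list name and output layout are assumptions made for illustration.

```python
# Patches merged before each v3.3 tag, as listed on the slide.
rc_patches = [
    ("v3.3-rc1", 9460),
    ("v3.3-rc2", 515),
    ("v3.3-rc3", 288),
    ("v3.3-rc4", 370),
    ("v3.3-rc5", 198),
    ("v3.3-rc6", 215),
    ("v3.3-rc7", 241),
    ("v3.3", 129),
]

total = sum(count for _, count in rc_patches)

# Fraction of the cycle's patches that arrived during the merge window.
merge_window_share = rc_patches[0][1] / total

for tag, count in rc_patches:
    print(f"{tag:<9} {count:>5} {count / total:6.1%}")
print(f"total     {total:>5}")
```

The bulk of the cycle's patches land before -rc1; everything after that is a comparatively thin stream of fixes.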
  • Slide 16
  • The stable/longterm kernel trees
      • The "stable" team: Greg Kroah-Hartman
          • The ongoing maintenance of stable kernels
          • The 3.x.y numbering scheme
          • Cc: stable
      • To be considered for an update release, a patch must
          1. fix a significant bug
          2. already be merged into the mainline for the next development kernel
      • The "long term" kernels
          • Purely a matter of a maintainer having the need and the time to maintain that release
      • The current long term kernels and their maintainers

        Kernel     Maintainer          Note
        2.6.27     Willy Tarreau       Deep-frozen stable kernel
        2.6.32     Greg Kroah-Hartman  Picked up by Ubuntu to 2015
        2.6.34     Paul Gortmaker      Wind River
        2.6.35     Andi Kleen          Embedded flag kernel
        3.0        Greg Kroah-Hartman  For 2 years at the minimum
        3.0.x-rtx  Steven Rostedt      Real-time
        3.2        Greg Kroah-Hartman  Ubuntu 12.04
  • Slide 17
  • The lifecycle of a patch Design Specifying the problem Early discussion Early review Posted to the relevant mailing list Wider review To ready for mainline inclusion Show up in the maintainer's subsystem tree Show up into the -next trees Follow through Persistent in updating the patch to the current kernel so that it applies cleanly Keep sending it for review and merging Merging into the mainline Stable release Long-term maintenance
  • Slide 18
  • How patches get into the kernel
      • Mainline (kernel.org): Linus Torvalds
      • Subsystem maintainers
          • ARM: Russell King
          • asm-generic: Arnd Bergmann
          • Networking: David S. Miller
          • Networking drivers: [email protected]
          • Wireless networking: John W. Linville
          • Documentation: Randy Dunlap
          • Staging: Greg Kroah-Hartman
          • AKPM: Andrew Morton
      • -next tree: Stephen Rothwell
      • Contributors
  • Slide 19
  • The lieutenant system built around the chain of trust Kernel series maintainers Linus Torvalds Benevolent dictator Avoid sending patches directly to Linus Subsystem maintainers Gatekeepers, integrators, tiebreakers or overrulers Coordinate subsystems and maintain consistency Share the load Lower-level trees maintainers Chain of repositories Don't have absolute authority Driver maintainers Other contributors Find the right maintainer
  • Slide 20
  • The linux-next tree & staging tree
      • The linux-next tree
          • Maintained by Stephen Rothwell
          • Up to 190 trees
          • The primary tree for next-cycle patch merging
          • A snapshot of what the mainline is expected to look like after the next merge window closes
          • Subsystem trees are collected for testing and review
          • Recreated and tested automatically every day
      • The staging tree
          • Maintained by Greg Kroah-Hartman
          • Where drivers or filesystems that are on their way to being added to the kernel tree live
          • A TODO file should be present
          • Code in staging which is not seeing regular progress will eventually be removed
  • Slide 21
  • Summary for development cycle
      • Rapid development cycle, no timelines/schedules
      • Only online documentation has a chance of being up-to-date
      • Accommodate large changes and high rate of change without regressions
      • Open discussion (mailing lists, archives, not private)
      • RERO, facilitates testing on a large variety of platforms
      • Maintainers available and accessible, don't disappear for long periods of time
      • Test suites
      • Bug tracking
  • Slide 22
  • How to get your change into the linux kernel linux/Documentation/SubmittingPatches
  • Slide 23
  • Creating and sending your change
      1. diff -up
      2. Describe your changes
      3. Separate your changes
      4. Style check your changes
      5. Select e-mail destinations
      6. Select your CC (e-mail carbon copy) list
      7. Just plain text
      8. E-mail size
      9. Name your kernel version
      10. Don't get discouraged. Re-submit.
      11. Include PATCH in the subject
      12. Sign your work
      13. When to use Acked-by: and Cc:
      14. Using Reported-by:, Tested-by: and Reviewed-by:
      15. The canonical patch format
      16. Sending git pull requests
  • Slide 24
  • 1/16: diff -up
      • Make patches against linux-next or Linus's tree
          • Unless they only apply to some other tree or patchset
      • To create patches (unified diff format)
          • diff -uprN
          • git diff
      • Lists of files to leave out of a diff
          • linux/Documentation/dontdiff
          • .gitignore
      • Make sure your patch does not include any extra files which do not belong in a patch submission.
      • Make sure to review your patch after generating it with diff, to ensure accuracy.
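
For readers who have not seen the unified format that `diff -u` and `git diff` produce, the sketch below builds one with Python's standard `difflib` module. It is an illustration only: the file name `lib/add.c` and its contents are hypothetical, and `difflib` does not emit the function-name hints that the `-p` option adds.

```python
import difflib

# Hypothetical before/after contents of a small C file.
old = ["int add(int a, int b)\n", "{\n", "\treturn a - b;\n", "}\n"]
new = ["int add(int a, int b)\n", "{\n", "\treturn a + b;\n", "}\n"]

# The a/ and b/ prefixes mimic the layout of a kernel patch, so that the
# result would be applied with one leading path component stripped.
patch = list(
    difflib.unified_diff(old, new, fromfile="a/lib/add.c", tofile="b/lib/add.c")
)

print("".join(patch), end="")
```

The output has the two file header lines, one `@@` hunk header, unchanged context lines prefixed with a space, and the changed line shown once with `-` and once with `+`. Reviewing that text before sending is the "review your patch" step above.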
  • Slide 25
  • 2/16: Describe your changes Describe the technical detail of the change(s) your patch includes. Be as specific as possible. The WORST descriptions possible include things like: "update driver X" "bug fix for driver X" "this patch includes updates for subsystem X. Please apply." Served as a commit log If the patch fixes a logged bug entry, refer to that bug entry by number and URL.
  • Slide 26
  • 3/16: Separate your changes Separate logical changes into a single patch file Patch series: an ordered sequence of multiple, related patches For example: Bug fixes and performance enhancements for a single driver An API update and a new driver which uses that new API MUST keep bisect-able git bisect
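
Why a series must stay bisect-able becomes clearer with a toy model of the search that `git bisect` performs. The sketch below is a simplified binary search, not git's implementation; the commit names and the `is_bad` test are hypothetical.

```python
def first_bad(commits, is_bad):
    """Return (index of the first bad commit, number of tests run).

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        # Each test requires building and running the tree at commits[mid].
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return hi, steps


# A hypothetical 16-patch series in which patch-11 introduces a regression.
commits = [f"patch-{i:02d}" for i in range(1, 17)]
idx, steps = first_bad(commits, lambda name: int(name[-2:]) >= 11)
print(commits[idx], steps)
```

The search only works if the tree builds and runs at every intermediate commit. A patch that leaves the tree broken until a later patch in the series lands makes the test at that midpoint impossible to evaluate.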
  • Slide 27
  • 4/16: Style check your changes Coding Style Documentation/CodingStyle Keep it looking like Linux code for readability, maintainability, debugging, etc. Check your patches with the patch style checker prior to submission, and justify all violations ./scripts/checkpatch.pl
  • Slide 28
  • 5/16: Select e-mail destinations
      • Best choice: look through the MAINTAINERS file and the source code
          • ./scripts/get_maintainer.pl
      • Next choice: if no maintainer is listed, or the maintainer does not respond, send to the primary Linux kernel developer's mailing list
          • [email protected]
      • With no other choice: Linus Torvalds is the final arbiter of all changes accepted into the Linux kernel. (Avoid sending him e-mail.)
          • [email protected]
      • Do not send more than 15 patches at once to the vger mailing lists!!!
  • Slide 29
  • 6/16: Select your CC list
      • Unless you have a reason NOT to do so, CC [email protected]
      • Other mailing lists for specific subsystems, such as USB, framebuffer devices, the VFS, the SCSI subsystem, etc.
          • http://vger.kernel.org/vger-lists.html
      • If changes affect userland-kernel interfaces
          • A man-pages patch to the MAN-PAGES maintainer (as listed in the MAINTAINERS file)
          • Or at least a notification of the change
      • For small patches you may want to CC the Trivial Patch Monkey
          • [email protected]
  • Slide 30
  • 7/16: Just plain text No MIME, no links, no compression, no attachments. Submit e-mail "inline" It is important for a kernel developer to be able to "quote" your changes, using standard e-mail tools. Be careful of your email client (and your editor) Be wary of your editor's word-wrap corrupting your patch, if you choose to cut-n-paste your patch. Documentation/email-clients.txt hints about configuring your e-mail client so that it sends your patches untouched git send-email Exception: If your mailer is mangling patches then someone may ask you to re-send them using MIME.
  • Slide 31
  • 8/16: E-mail size
      • If uncompressed size > 300kB
          • Store your patch on an Internet-accessible server, and provide instead a URL (link) pointing to your patch
      • Acceptable SLOC for a patch: < 100 lines
  • Slide 32
  • 9/16: Name your kernel version Make patches against linux-next or Linus's tree It is important to note, either in the subject line or in the patch description, the kernel version to which this patch applies. If the patch does not apply cleanly to the latest kernel version, the subsystem maintainer will not apply it.
  • Slide 33
  • 10/16: Don't get discouraged. Re-submit. After you have submitted your change, be patient and wait. If your patch is dropped, it could be due to: Your patch did not apply cleanly to the latest kernel version. Your patch was not sufficiently discussed on linux-kernel. A style issue. An e-mail formatting issue. A technical problem with your change. There are tons of e-mail, and yours got lost in the shuffle. You are being annoying. When in doubt, solicit comments on linux-kernel mailing list.
  • Slide 34
  • 11/16: Include PATCH in the subject
      • Prefix your subject line with [PATCH]
          • Due to high e-mail traffic to Linus, and to linux-kernel
          • To distinguish patches from other e-mail discussions
      • If you want to send a patch series (git format-patch options)
          • --thread: to make threaded mails
          • --cover-letter: to add a cover letter numbered [PATCH 00/14]
      • If patches are modified and re-submitted
          • Prefix with [PATCH v2]
  • Slide 35
  • 12/16: Sign your work
      • The sign-off: a simple line at the end of the explanation for the patch, to certify that you wrote it or otherwise have the right to pass it on as an open-source patch
      • The rules are pretty simple: if you can certify the Developer's Certificate of Origin 1.1 (see next slide for DCO), then you just add a line saying
            Signed-off-by: Random J Developer
          • Using your real name
          • No pseudonyms or anonymous contributions
      • For a subsystem or branch maintainer: sometimes you need to slightly modify patches you receive in order to merge them (if you stick strictly to DCO)
            Signed-off-by: Random J Developer
            [[email protected]: struct foo moved from foo.c to foo.h]
            Signed-off-by: Lucky K Maintainer
      • Note that under no circumstances can you change the author's identity (the From header), as it is the one which appears in the changelog.
  • Slide 36
  • Developer's Certificate of Origin 1.1 By making a contribution to this project, I certify that: (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
  • Slide 37
  • Special note to back-porters To facilitate tracking, insert an indication of the origin of a patch at the top of the commit message Example 1 in 2.6-stable: Date: Tue May 13 19:10:30 2008 +0000 SCSI: libiscsi regression in 2.6.25: fix nop timer handling commit 4cf1043593db6a337f10e006c23c69e5fc93e722 upstream Example 2 in 2.4: Date: Tue May 13 22:12:27 2008 +0200 wireless, airo: waitbusy() won't delay [backport of 2.6 commit b7acbdfbd1f277c1eb23f344f899cfa4cd0bf36a]
  • Slide 38
  • 13/16: When to use Acked-by: and Cc: Signed-off-by: The signer was involved in the development of the patch Or he/she was in the patch's delivery path Acked-by: The acker has at least reviewed the patch and has indicated acceptance He/she can acknowledge just the part of the patch Patch mergers will sometimes manually convert an acker's "yep, looks good to me" into an Acked-by: Cc: A person (in potentially interested parties) has had the opportunity to comment on a patch, but has not provided such comments The only tag which might be added without an explicit action by the person it names
  • Slide 39
  • 14/16: Using Reported-by:, Tested-by: and Reviewed-by: Reported-by: To credit the reporter for their contribution Please note that this tag should not be added without the reporter's permission, especially if the problem was not reported in a public forum. Tested-by: To inform maintainers that some testing has been performed To provide a means to locate testers for future patches To ensure credit for the testers Reviewed-by: The patch has been reviewed and found acceptable according to the Reviewer's Statement (see next slide) To give credit to reviewers and to inform maintainers of the degree of review which has been done on the patch Any interested reviewer (who has done the work) can offer a Reviewed-by tag for a patch.
  • Slide 40
  • Reviewer's statement of oversight By offering my Reviewed-by: tag, I state that: (a) I have carried out a technical review of this patch to evaluate its appropriateness and readiness for inclusion into the mainline kernel. (b) Any problems, concerns, or questions relating to the patch have been communicated back to the submitter. I am satisfied with the submitter's response to my comments. (c) While there may be things that could be improved with this submission, I believe that it is, at this time, (1) a worthwhile modification to the kernel, and (2) free of known issues which would argue against its inclusion. (d) While I have reviewed the patch and believe it to be sound, I do not (unless explicitly stated elsewhere) make any warranties or guarantees that it will achieve its stated purpose or function properly in any given situation.
  • Slide 41
  • 15/16: The canonical patch format
      • The canonical patch subject line (see next slide)
            Subject: [patch 2/5] ext2: improve scalability of bitmap searching
            Subject: [PATCHv2 001/207] x86: fix eflags tracking
      • The canonical patch message body
          • A "from" line specifying the patch author.
          • An empty line.
          • The body of the explanation, which will be copied to the permanent changelog to describe this patch.
          • The "Signed-off-by:" lines, described above, which will also go in the changelog.
          • A marker line containing simply "---".
          • Any additional comments not suitable for the changelog.
          • The actual patch (diff output).
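
The message-body layout described on this slide can be illustrated by splitting a sample body at the "---" marker. This is a deliberately simplified sketch, not how git or any patch-handling tool parses mail; the function name, the example address, the changelog text and the diffstat line are all hypothetical.

```python
def split_patch_body(body):
    """Split a patch e-mail body into (author, changelog, signoffs, rest).

    Simplified: expects an optional leading "From: " line and treats the
    first line that is exactly "---" as the marker line.
    """
    lines = body.splitlines()
    author = None
    if lines and lines[0].startswith("From: "):
        author = lines[0][len("From: "):]
        lines = lines[1:]
    marker = lines.index("---") if "---" in lines else len(lines)
    log_lines = lines[:marker]
    rest = lines[marker + 1:]  # extra comments, diffstat and the diff
    signoffs = [line for line in log_lines if line.startswith("Signed-off-by: ")]
    changelog = "\n".join(
        line for line in log_lines if line not in signoffs
    ).strip()
    return author, changelog, signoffs, rest


# Hypothetical message body following the canonical layout.
body = "\n".join([
    "From: Random J Developer <rjd@example.org>",
    "",
    "Searching the bitmap one bit at a time is slow on large filesystems.",
    "Scan a word at a time instead.",
    "",
    "Signed-off-by: Random J Developer <rjd@example.org>",
    "---",
    " fs/ext2/balloc.c | 4 ++--",
    "",
    "--- a/fs/ext2/balloc.c",
    "+++ b/fs/ext2/balloc.c",
])

author, changelog, signoffs, rest = split_patch_body(body)
print(author)
print(changelog)
print(signoffs)
```

Everything above the marker ends up in the permanent changelog; everything below it is dropped from the log when the patch is applied, which is why version-to-version notes belong there.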
  • Slide 42
  • The Subject line format
      • Subject: [PATCH tag] subsystem: summary phrase
      • The "tag"
          • Version descriptor: v1, v2, v3
          • Request for comments: RFC
          • Sequence number: 1/4, 2/4, 3/4, 4/4
              • Zero-padded: to sort the emails alphabetically (and numerically) by subject line
      • The "subsystem"
          • To identify which area or subsystem of the kernel is being patched
      • The "summary phrase"
          • A globally-unique identifier for that patch
              • What the patch changes
              • Why the patch might be necessary
          • Not a filename
          • Not the same one for every patch in a whole patch series
          • No more than 70-75 characters
          • git log --oneline
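
A quick self-check of a subject line against the `[PATCH tag] subsystem: summary phrase` shape might look like the sketch below. The regular expression and the `check_subject` helper are assumptions made for illustration; they are not part of any kernel script, and `checkpatch.pl` applies its own rules.

```python
import re

# "[... PATCH ...] subsystem: summary", matched case-insensitively so that
# the slide's "[patch 2/5]" example is accepted too.
SUBJECT_RE = re.compile(
    r"^\[(?P<tag>[^\]]*PATCH[^\]]*)\]\s+"
    r"(?P<subsystem>[^:\s][^:]*):\s+"
    r"(?P<summary>\S.*)$",
    re.IGNORECASE,
)


def check_subject(subject, max_summary=75):
    """Return a list of problems found in a patch subject line."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        return ["does not match '[PATCH tag] subsystem: summary phrase'"]
    problems = []
    if len(match.group("summary")) > max_summary:
        problems.append(f"summary phrase is longer than {max_summary} characters")
    return problems


print(check_subject("[PATCHv2 001/207] x86: fix eflags tracking"))
print(check_subject("update driver X"))
```

The first call prints an empty list and the second reports the missing tag and subsystem prefix. A check like this cannot judge whether the summary phrase is meaningful or unique within the series.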
  • Slide 43
  • The from line and marker line
      • The "from" line
            From: Original Author
          • Must be the very first line in the message body
          • If the "from" line is missing, then the "From:" line from the email header will be used to determine the patch author in the changelog.
      • The marker line "---"
          • Serves the essential purpose of marking for patch handling tools where the changelog message ends
          • Additional comments may follow the marker line:
              • Diffstat: to show what files have changed, and the number of inserted and deleted lines per file
              • Patch changelogs: to describe what has changed between the v1 and v2 version of the patch
  • Slide 44
  • 16/16: Sending "git pull" requests
      • The proper format
            Please pull from
                git://jdelvare.pck.nerim.net/jdelvare-2.6 i2c-for-linus
            to get these changes:
          • Do write the git repo address and branch name alone on the same line
      • To generate the diffstat
          • git diff -M --stat --summary
          • -M enables rename detection
          • --summary enables a summary of new/deleted or renamed files
      • git request-pull
  • Slide 45
  • Summary for submitting patches (1)
      • Patch current mainline from kernel.org or linux-next or target-specific trees
      • Send patches to subsystem maintainer, driver maintainer, & mailing list
      • Each patch (re-)submission should include feature justification and explanation, not just the patch
      • Use the DCO (Signed-off-by: Your Name [email protected])
      • Patches should be encapsulated (self-contained) as much as possible, not touching other code (when that makes sense)
  • Slide 46
  • Summary for submitting patches (2) ONE patch per email, logical progression of patches, not mega-patches, not attached and not zipped (cannot review/reply) Don't do multiple things in one patch (like fix a bug and do some cleanup) Check your email client: send a patch to yourself and see that it still applies (doesn't damage whitespace, line breaks, content changed) before going public with it Patch must apply with 'patch -p1'; i.e., use expected directory levels Don't use PGP or GPG with patches, they mess up patch scripts
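
The "expected directory levels" point refers to how `patch -p1` drops the first path component from the file names in a diff. The sketch below models only that stripping step, under the assumption of simple relative paths; `strip_components` is a hypothetical helper and the driver path is made up.

```python
def strip_components(path, count):
    """Drop `count` leading directory components, as patch -p<count> does.

    Simplified model: splits on single slashes only.
    """
    parts = path.split("/")
    if count >= len(parts):
        raise ValueError(f"cannot strip {count} components from {path!r}")
    return "/".join(parts[count:])


# File names as they might appear in the "---"/"+++" header lines of a diff.
print(strip_components("a/drivers/net/foo.c", 1))
print(strip_components("linux-3.3/drivers/net/foo.c", 1))
print(strip_components("drivers/net/foo.c", 0))
```

All three calls yield `drivers/net/foo.c`, a path relative to the top of the kernel tree. A diff generated from inside a subdirectory, or from one level too high, would not resolve to the right file under `-p1`.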
  • Slide 47
  • Communication methods
  • Slide 48
  • Communications
      • Communicating is hard, let's go shopping
      • Writing ideas/thoughts down is good (but too wordy may be ignored)
      • Participate constructively
      • Mailing lists & archives (newsgroups)
      • Working in open/public (technical readers/writers) vs. embarrassment
      • Discussion and decisions on lists, no meetings required
      • Work through consensus (with exceptions)
      • Project web pages, IRC channels
      • Developer conferences
  • Slide 49
  • Mailing List Etiquette
      • Use Reply-to-All, threaded (Message-ID, References)
            > > Try A or B.
            > I prefer A, sound OK?
            yes
      • Be prompt with replies (being responsive is important)
      • No encoded or zipped attachments (inline preferred, text/plain attachments OK); others are often ignored
      • No HTML or commercial email, no auto-replies (OOO/vacation)
      • ALL CAPS == SHOUTING; rude, don't do it
      • Use < 80-column width lines (70-72 is good) for text (not for patches)
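
The column-width advice applies to prose only. As a rough sketch, a reply paragraph can be re-flowed to 72 columns with Python's standard `textwrap` module; the `wrap_reply` helper and the sample text are hypothetical, and nothing like this should ever be run over the patch itself, since re-wrapping corrupts a diff.

```python
import textwrap


def wrap_reply(text, width=72):
    """Re-flow each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = text.split("\n\n")
    return "\n\n".join(
        textwrap.fill(" ".join(paragraph.split()), width=width)
        for paragraph in paragraphs
    )


reply = (
    "I tested both variants on two machines and option A was consistently "
    "faster, so I would prefer to go with A unless someone objects.\n\n"
    "I will send an updated series once the remaining review comments have "
    "been addressed."
)

wrapped = wrap_reply(reply)
print(wrapped)
```

Most mail clients and editors can do the same re-flow natively; the point is only that the limit is enforced on the explanatory text and never on the diff.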
  • Slide 50
  • Mailing List Etiquette (2)
      • Keep it technical and professional.
      • If attacked (flamed), stick with technical points, don't get involved with attacks, & move on.
      • Trim replies (body) to relevant bits (don't modify To:/Cc: recipient list).
      • Don't cross-post to closed mailing lists.
      • Non-English speakers
      • Resources:
          • http://www.arm.linux.org.uk/armlinux/mletiquette.php
          • RFC 1855: Netiquette Guidelines: http://www.ietf.org/rfc/rfc1855.txt
  • Slide 51
  • No top-posting
        A: http://en.wikipedia.org/wiki/Top_post
        Q: Where do I find info about this thing called top-posting?
        A: Because it messes up the order in which people normally read text.
        Q: Why is top-posting such a bad thing?
        A: Top-posting.
        Q: What is the most annoying thing in e-mail?
        A: No.
        Q: Should I include quotations after my reply?
  • Slide 52
  • Some Good Terms to Use Simpler Deletes N lines of code Faster (with data) Smaller (with data) Here's the code.... Series of small patches.... Tested... (how many configs) Builds on 8 architectures
  • Slide 53
  • Some Best Known Practices
  • Slide 54
  • Some Best Known Practices (1) Send patches directly to their intended maintainer for merging (they don't troll mailing lists looking for patches to merge) Copy patches to the appropriate mailing list(s), not private (don't work in isolation) Subscribe to relevant mailing lists (or use one representative for this) Listen to review feedback and promptly respond to it
  • Slide 55
  • Some Best Known Practices (2) Some maintainers do not acknowledge when they merge a patch; you just have to keep watching Use correct 'diff' directory level (linux/ top-level directory) and options (-up) Use source code to convey ideas Generate patch files against the latest development tree branch (-rcN) or mainline kernel if there is no current development branch Make focused patches or a series of patches, not large patches that cover many areas or that just synchronize a (CVS) repository with the kernel source tree
  • Slide 56
  • Some Best Known Practices (3) Use an email client that supports inserting patches inline (not as attachments) Begin with small patches: use kernel-janitors mailing list e.g. For larger patches or complete drivers or features, use the kernel-mentors mailing list (for beginner feedback, comments and corrections) Don't post private email replies to a public mailing list (without permission) Don't introduce gratuitous whitespace changes in patches
  • Slide 57
  • Some Best Known Practices (4) Back up your patch with performance data (if applicable) Don't add binary IOCTLs unless there are no other acceptable options; use sysfs (/sys) or private-fs or debug-fs or relayfs or netlink if possible Make Linux drivers that are native Linux drivers, not a shim from another OS Don't introduce kernel drivers if the same functionality can be done reasonably in userspace Try to be processor- and distro-agnostic (except for CPU-specific code) Don't be afraid to accept patches from others
  • Slide 58
  • Some Best Known Practices (5) Keep your patch(es) updated for the current kernel version Resubmit patches if they are not receiving comments Release early, release often Open, public discussion on mailing lists One patch per email Large patches should be split into logical pieces and mailed as a patch series Make testing tools available & easy to use; your device(s) will get better testing
  • Slide 59
  • Getting involved
      • Andrew Morton gives this advice for aspiring kernel developers (http://lwn.net/Articles/283982/):
      • The #1 project for all kernel beginners should surely be "make sure that the kernel runs perfectly at all times on all machines which you can lay your hands on". Usually the way to do this is to work with others on getting things fixed up (this can require persistence!) but that's fine - it's a part of kernel development.
  • Slide 60
  • Getting Involved in Kernel Development
      • Testing, feedback/results
      • Learn some basics at http://kernelnewbies.org
      • Find some small tasks that are identified at http://kernelnewbies.org/KernelJanitors
      • Focus on an area that you are interested in (many to choose from)
      • Can just fix compile/build/sparse problems as an introduction
      • Fix bugs in the bugzilla database at http://bugzilla.kernel.org
      • Add to kernel documentation
  • Slide 61
  • Reference
      • Linux-mentoring
      • Linux Kernel Development: Working in the Community: Social/Cultural Engineering Issues
      • By Randy Dunlap [email protected]
      • Various versions at: http://www.xenotime.net/linux/mentor/