Android open source



[music playing] Don Dodge: Hi, I'm Don Dodge, and this is "Google Root Access." Today we're going to talk about open source-- how Google uses open source, and how Google contributes to open source.


My guest today is Chris DiBona, who runs the open source projects for Google. Welcome, Chris. Chris DiBona: Thanks, Don.


Don Dodge: So tell me, what do you do at Google? Chris DiBona: So very basically, if there's open source at Google, it's my job and my team's job to look after it. And that includes all the code we might use when creating sites like google.com, and Gmail, and all the docs and spreadsheets and apps that we have online. But also all the things that we ship-- things like Android and Chrome.


And if you have a Nexus device, or if you're one of the people who is a Google Glass beta tester, you'll be running open source code on there, whether as part of the proprietary software running on top of it, or as a whole open source ecosystem like Android and Chrome. So we make sure that we ship things in compliance with those licenses, but also that those teams have the correct infrastructure as well.


So we look after all kinds of infrastructure for Android and some of the other teams right now. Don Dodge: So some people may not know this, but Google's roots are actually in open source. Google built its own servers and used Linux, and hacked Linux to do very interesting things. Can you tell us a little about that? Chris DiBona: Sure. Well, actually, if you look back to when Larry and Sergey


were at Stanford and building BackRub, which of course became Google, that was all written in Python, using some fairly standard web-serving tools and libraries, and they put the uniqueness that was Google on that. So you had Python running on top of Linux, running on top of all kinds and all manner of hardware. And as we grew up as a company, we kept that ethos where we would get what a lot of people would consider


fairly trashy hardware, and we would get our reliability from the software layer. And that was Linux, and that was Python, and that was eventually Java and C and the rest. And so we started with open source in a lot of ways and built on top of it, and it's been a pretty healthy code system since then. Don Dodge: So Google not only uses open source and used it to build the base of Google, the infrastructure of Google,


but Google also contributes to open source? Chris DiBona: Yeah, to date we've conservatively estimated that we've contributed from Google developers about 50 million lines of code, give or take. And that spans everything from Android and Chromium all the way through to compilers. For instance, if you have any consumer electronics device that's been made in the last five years, you're probably running some Google code on it and you don't even know it,


through the libraries that we've contributed to and just the vast number of projects we've released, so it's fairly shocking. And this is not even taking into account the vast proliferation of Android, and the technology inside Android. Don Dodge: So the infrastructure things-- Bigtable and Hadoop. Tell us a little bit about how that evolved and what are some of those open source projects that either came out of


Bigtable in Google's infrastructure, or were modeled after that. Chris DiBona: Yeah, so if you look at Hadoop, we published a paper back in 2000-- I want to say 2002 or 2003-- about the Google File System, which is called GFS. And it inspired folks like Doug Cutting to go out and write open source implementations of this technology. And that happens with a lot of the things that we do


publications about, but don't actually ship as open source software. So what you see coming from Google is often, if we talk about it, people will want it. And if we don't release it as open source, they'll try to create it. That's actually really healthy. And so you've seen a proliferation of these kinds of big data tools like MongoDB and Redis and LevelDB and all


these really interesting large-scale key-value stores and stuff. And they have their roots in us publishing papers about our internal systems like Bigtable and GFS and Spanner and the rest. Don Dodge: You mentioned Chromium, Android, and some other pretty major open source products that Google has. Are they licensed the same, and are they basically the same? Or are they quite different in how we handle them?


Chris DiBona: So Chromium and Android are actually quite different. If you look at Chromium, when we wrote it we really wanted to inspire other browser manufacturers to, frankly, do what we do. And so we wanted to make it possible for them to actually use our code. So beyond just contributing to WebKit, we also made a conscious decision to license Chromium under BSD.


Because that code would then be consumable by not just WebKit-based browsers, but also those of them that might use Gecko, Mozilla. Even Microsoft could take that code and pull it into their browser or their software, and not just learn, but actually just use what we're publishing. Because when we shipped Chromium-- this is something a lot of people don't understand-- when we first shipped Chrome and Chromium, this was a time when


one tab getting slow would take down your whole browser. And we weren't really taking advantage of the full capabilities of a modern CPU or modern hardware. And so we really wanted to bring that kind of process isolation, and also the security of a sandbox, to the browser so that people could feel the browser was a safe place to work and to consume information. So we tried to convince other browser manufacturers that this was really important.


And we decided that it was time to just do it, and do it in such a way that they could, if they ended up agreeing with us later, just adopt the work. And we've been moderately successful with that latter mission. But I think Chromium has been very successful at showing that you can take real advantage of process isolation and sandboxing to secure a web browser. And I think that's really important, and people miss


that all the time. Now, Android's a different story altogether. With Android, we licensed Android under the Apache license, with the kernel being the GPL Linux kernel. And we chose Apache for a number of reasons. Apache has this really awesome language around patents. So it says, hey, if you're using our software and we have patents that read on that software, we're not going to come and rent seek against you for that software


unless you sue us. So if you sue us, we're going to retain the rights to use those patents in the defense of our company against you. But otherwise, we're telling you that you have nothing to fear from Google and from using Android. And that was a really big deal, and I think it really mattered with the carriers. Also, Apache has this unusual aspect to it in the world of open source where it says we just want you to acknowledge


you're using it. We don't need you to share your source code, too. And for the carriers, who have a lot of balls that they juggle in the air at one time, making compliance easier for them is very important. And I think it has a lot to do with why Android is as popular as it is with so many manufacturers, and you know that it's a real open source project. So companies will often say, hey, I've got


this open source thing. And then you'll see that they're the only ones running it, and the only ones who will ever run it. And it comes down to trust, and it comes down to what license they use. I think it's extremely telling that when you look at a Kindle, or you look at a Nook, or you look at half a dozen Android-based devices that don't have anything to do with Google,


that they continue to persist and exist, because that's really open source. We can't stop them, nor do we want to. But even if we wanted to, we couldn't stop them from using the software. And that's how you know it's really open source. Don Dodge: So I sort of understand how open source works and what it is. But I'm confused about this jumble of licenses, there's an


alphabet soup of licenses. Chris DiBona: You're not alone. Don Dodge: Can you say a little bit about what the major licenses and differences are? Chris DiBona: Well, there are three major categories of licenses. There's what we call the permissive licenses, and they basically say, here's some code, Don, enjoy it. And all we really ask from you is that you mark in your about


box or in your documentation that you're using it. So those are the permissive/notification licenses. Don Dodge: What are some examples of the [inaudible]? Chris DiBona: So that would be BSD, that would be Apache, MIT, ones like that. And then there are the more share-alike style licenses. And they say if I give my software to Don and Don ships something, well, then he's obligated to


share his code, too. For some definition of share and use and all the rest, of course. But those are your GPL licenses, your LGPL licenses, and to a lesser degree the MPL licenses as part of the Mozilla tri-license. So depending on which one it is, you'd either have to share more or less code, depending on how you use that software. And this stuff is deeply confusing, so you shouldn't


feel like you're ignorant of this stuff-- almost everyone is. And it's not just a jobs program for me and people like me, it's just complicated stuff. Because these licenses, they say how people feel about sharing their software. And when you talk about feelings, things get confusing and complicated. And then the third category of licenses are the ones that are


trying to also define network performance as being the kind of sharing that invokes the resharing of software. So there are very few of these, and they're applied to very few actual pieces of software. But it's a version of the GPL called the Affero GPL, or the AGPL. And so, for instance, we don't use any AGPL software at Google and we don't allow it here. And that would be like MongoDB, and


RStudio and the rest. And so in those cases, we either purchase a commercial license from those software companies or we simply don't allow them. Don Dodge: So the AGPL, we don't use that because--? Chris DiBona: Because as it interacts with other software and then is displayed on, say, a google.com site or on a Gmail or Apps site, we would then have to share more software than we're comfortable sharing through


the linking to that software. Don Dodge: So if I'm a startup and I'm building my company, what do I need to look out for in terms of using open source? And which license should I have, and that kind of thing? Chris DiBona: Well, except for the AGPL, you pretty much don't have anything to worry about as, say, a web startup. If you're going to be shipping software, that's a somewhat longer conversation where you try to suss out what the startup wants to do, what they're shipping, what they


want to share. Because open source is, in my mind, also a fundamental part of a developer relations strategy and a developer ecosystem. Without it you don't have one, or it's a very unfair one, frankly. So what I tell startups when I advise them is that when you bring software into the company that isn't written by someone at the company, and that means anything--


examples, software from places like GitHub, or Stack Exchange, or Stack Overflow-- you should always put these in a special directory, like a third party directory. This is what we do at Google, by the way. We have millions of these directories all over the place. Segregate the code-- you can absolutely, unconditionally still use it,


you just want to keep it out in a special place. Don Dodge: In your code base you actually segregate the open source code and you keep it there? Chris DiBona: Yeah. And so you have these third party directories and then you call that code in whatever way you see fit, but then at least you know where it is. So there are other hygiene things you can do, like make sure you only have one version of a specific library, even if


it's used across multiple products. Keeping up to date with security [inaudible], all these things that are fundamental to development. But keeping things separate enough that you can keep on identifying it is pretty amazing. If you have that, you're better than 90% of the companies out there. Don Dodge: So that's rule number one? Chris DiBona: That's rule number one.
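As an illustration only (this is not from the interview), the third party directory habit Chris describes could be checked with a small script along these lines. The layout `third_party/<library>/<version>/LICENSE` and the function names `inventory` and `hygiene_report` are assumptions made for this sketch; the conversation does not say how Google's directories are actually organized.

```python
from collections import defaultdict
from pathlib import Path


def inventory(third_party: Path) -> dict[str, list[str]]:
    """Map each library under third_party/ to the versions checked in.

    Assumes a hypothetical layout: third_party/<library>/<version>/...
    """
    versions: dict[str, list[str]] = defaultdict(list)
    if not third_party.is_dir():
        return {}
    for lib_dir in sorted(third_party.iterdir()):
        if not lib_dir.is_dir():
            continue
        for version_dir in sorted(lib_dir.iterdir()):
            if version_dir.is_dir():
                versions[lib_dir.name].append(version_dir.name)
    return dict(versions)


def hygiene_report(third_party: Path) -> list[str]:
    """List two kinds of problems: duplicate versions and missing LICENSE files."""
    problems = []
    for lib, vers in inventory(third_party).items():
        if len(vers) > 1:
            problems.append(
                f"{lib}: multiple versions checked in ({', '.join(vers)})"
            )
        for version in vers:
            if not (third_party / lib / version / "LICENSE").is_file():
                problems.append(f"{lib}/{version}: no LICENSE file")
    return problems


if __name__ == "__main__":
    # Prints nothing if ./third_party does not exist or is clean.
    for line in hygiene_report(Path("third_party")):
        print(line)
```

The point of the sketch is the one Chris makes: once outside code lives in one known place, questions like "which versions do we carry?" and "do we have the license text?" become a directory walk rather than an investigation.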


You have to be careful with the AGPL. So for instance, a lot of people really like MongoDB right now for what it does, and that makes perfect sense. And if you look at how MongoDB is licensed and the way they articulate that license, it's very clever. They have very smart client licensing, which basically doesn't transfer that network obligation onto a website. But with the database itself, if you make improvements to the database, if you have certain


code interacting with the database in certain ways, that code may end up being shared. So you'll just want to know that when you're using tools like MongoDB. Magento is another really popular shopping cart application. And it's not unfair to say that this drives a lot of license revenue to Magento and MongoDB, and it's fair and this is their deal.


They're saying, listen, we want you to use this stuff. But if you use it in certain ways, you should be paying us. And so that's a commercial software license that you would get from them. So that's the sort of thing we see in acquisitions actually quite a bit. Don Dodge: Good point. You also review acquisitions for Google. Every time Google does an acquisition, you look at the


code and look for open source, make sure it's licensed. So say a little bit about what you do in that process. Chris DiBona: Yeah, so for instance, suppose you're a startup that does something with video. You're probably using FFmpeg, you're probably using it with certain compile flags. And most people don't even know this, by the way. At this point, by the time I say compile flags, it means there's a developer inside that organization who built


that piece of software. And I can actually tell from the output that I, the consumer, can see pretty much what you've done. So it's very easy for me to say, OK, you're probably using this, this, and this. And they're like, who told you what we're using? And it's like, we know. We've seen a million startups like this, and so here's what you'll have to do to be compliant with these licenses.
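As an illustration only (this is not from the interview), the three license categories Chris lays out earlier can be written down as a lookup table plus a first-pass note per category. The names `Category`, `LICENSE_CATEGORY`, and `review_note` are invented for this sketch, and the notes are a coarse paraphrase of the conversation; real obligations depend on the license text and on how the code is used.

```python
from enum import Enum


class Category(Enum):
    PERMISSIVE = "permissive / notification"
    SHARE_ALIKE = "share-alike"
    NETWORK = "network share-alike"


# Grouping as described in the conversation; keys are short labels only.
LICENSE_CATEGORY = {
    "BSD": Category.PERMISSIVE,
    "Apache": Category.PERMISSIVE,
    "MIT": Category.PERMISSIVE,
    "GPL": Category.SHARE_ALIKE,
    "LGPL": Category.SHARE_ALIKE,
    "MPL": Category.SHARE_ALIKE,
    "AGPL": Category.NETWORK,
}


def review_note(license_name: str, ships_software: bool) -> str:
    """Return a first-pass note for a license; a simplification, not a ruling."""
    category = LICENSE_CATEGORY.get(license_name)
    if category is None:
        return "unknown license: review by hand"
    if category is Category.PERMISSIVE:
        return "acknowledge use in the about box or documentation"
    if category is Category.SHARE_ALIKE:
        if ships_software:
            return "shipping may oblige you to share code: check how it is used"
        return "web-only use: little to worry about, per the conversation"
    # Category.NETWORK: serving over a network can itself count as sharing.
    return "network use may oblige you to share code: consider a commercial license"


if __name__ == "__main__":
    for name in ("MIT", "LGPL", "AGPL"):
        print(name, "->", review_note(name, ships_software=False))
```

A table like this only sorts licenses into buckets. The harder part Chris describes, working out what a company actually ships and how it links the code, still has to be done case by case.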


So we have that discussion a lot with companies: by deal close you have to do x, y, and z. And it's funny, when I was younger and maybe more idealistic or less cynical, I would say, I can't believe these people are breaking licenses in such a fundamental way. But what it is is most people's intentions are pure. They just don't know how to be compliant with these licenses, and sometimes it's actually deeply complicated.


So oftentimes when we see acquisitions come in, we'll say, OK, if you want to keep shipping this piece of software, you have to do x and y and z. And you've been shipping this piece of software for this long, so you'll need to do this. So that people who want their rights under the open source licenses which you have chosen to use are going to be able to be satisfied. Because we like to in some ways over-comply.


Put up mirrors of code so that people can find the information they want about open source in your product simply by Googling for it. It saves us a lot of time compared with having to answer emails and sit on mailing lists and worry a lot about things. So we try to bring people into our level of compliance, which is very high, by the time a deal closes. We've never had to scuttle an acquisition because of this.


I've seen acquisitions fall apart, and this might be a component of that, but it's very rare. It's more that bringing discipline to your code base for open source really has higher implications about the quality of the code of the company. Don Dodge: Overall? Chris DiBona: Yeah, exactly. So it's extremely rare to find somebody who understands this stuff.


What's the right way to say this? If someone has a good, tidy code base, then they're usually also either very compliant with open source licenses already, or easily able to become compliant. If they're not compliant with open source licenses and they have an untidy code base, that could have implications, too, during an acquisition that have nothing to do with the open source license compliance and everything to do with how they develop and how they bring software together.


It's more of an indicator. Don Dodge: So if a startup is using MongoDB, or Magento-- Chris DiBona: Which a lot are. It's a great tool. They're both great tools. Don Dodge: So what happens then? We just don't consider using that kind of software? Chris DiBona: In rare cases we have a pool of licenses for them to draw upon, but we keep that very limited.


Because usually when people come to Google, they end up transferring onto our technology or we're shutting their technology off. It depends on the situation. It's very hard to say we always do x or we always do y, because we want to remain flexible for these kinds of things. There are some companies, and I'm not going to name them, where if you have GPL software they will simply force you to


rip it out before they'll even talk to you, much less let the deal close. So it's rare nowadays, but it certainly happens. And I'm not talking about the Linux kernel. Everyone's comfortable with the Linux kernel. It's sort of an accepted thing that they have to deal with that. But for some GPL tools and system libraries and web frameworks and stuff, they don't want


to have that around. Don Dodge: Well, that's really a competitive advantage for Google and it has been from the beginning. Chris DiBona: I would consider it so. Yeah. For instance, if you reject all GPL and LGPL code, you're rejecting 75% of the open source code that's out there. And we're talking, I think it was probably 35 million unique files out there encompassing maybe 5 billion lines of code.


So you would be rejecting 75% of that just by saying, I don't want to deal with the GPL and the LGPL. Again, it's a fine decision to make. I think it's very limiting. Don Dodge: So Google's roots are in open source and we use open source, we contribute to open source. Chris DiBona: We create new open source developers with the Summer of Code. I mean, we do a lot.


Don Dodge: Say more about the Summer of Code. You've been running that program for quite a while. Chris DiBona: Yeah, we're coming up on our ninth year, I believe. And what that does is it basically introduces students to open source software. It sets up a one-to-one mentoring relationship between open source software projects and students. So the idea was Larry Page was uncomfortable that people were


leaving college for the summer and then backsliding over the summer because they had to go get a job that wasn't computer-related. So he was like, if you could fix that, that would be great. And I was like, OK. So we came up with the Summer of Code as a way of fixing that. And the idea was we would give people real world, real bug, real problem experience in open source.


And it would also expose them to the world of open source development. I think that's been an incredible good for computer science. We've only taken about 10,000 students through the program now. Don Dodge: 10,000? That's a lot. Chris DiBona: 10,000, 9,000, yeah, about that.


And we pay them over the summer to do this work. Don Dodge: So they actually get paid to do this? Chris DiBona: We spend a fair amount of money. And they on average write about 3,000 lines of code that ends up in open source software. So you can do the math there, too-- it's a significant amount of code. When you're a computer science student-- I don't know if you went through CS school-- but you're


taught a lot of theory and a lot of comparative programming languages and algorithms and that kind of thing. You're not taught what happens when a user uses your software. You're not taught what happens when they decide to run it on some horrible piece of trash computer that you never expected. And open source software projects are exposed to that kind of real world usage all the time.
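As an illustration only (this is not from the interview), the Summer of Code math Chris invites can be done from the round figures quoted in the conversation: roughly 9,000 to 10,000 students, and about 3,000 lines of code each on average. The function name `total_lines` is invented for this sketch, and the result is only as precise as those spoken estimates.

```python
def total_lines(students: int, lines_per_student: int) -> int:
    """Rough total of contributed lines: students times average lines each."""
    return students * lines_per_student


# Figures as quoted in the conversation; both are spoken estimates.
low = total_lines(9_000, 3_000)
high = total_lines(10_000, 3_000)
print(f"{low:,} to {high:,} lines")  # prints "27,000,000 to 30,000,000 lines"
```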


And so you get a whole different view into how to make a software program work and what users need by working on open source. And additionally, this is not a "we bring them into Google and host them" kind of program. This is a "wherever they are, all over the world" kind of program. We've done this in 93 countries now. Don Dodge: Really? Chris DiBona: And paying people in 93


countries is not easy. But we've had students in pretty much any country you can think of in the modern world. Well, perhaps there's more now that there's 273 countries in the UN. It's been pretty remarkable to see how these students turn from these neophyte programmers, really, into pretty amazing remote programmers, which frankly is something that industry struggles with.


Having remote developers is extremely difficult, and open source seems to figure it out. And what it comes down to, really, is the people who are good at that kind of thing are good at open source, and the people who aren't, aren't. So we have a failure mechanism in the Summer of Code as well, where we probably fail about 14%, 15% of the students every year out of the program. And that means they don't get paid at


some step of the program. So maybe they'll get the acceptance fee, but not the mid-term or the final payment. So it's very merit-based, which is also, I think, a key thing about open source. You either do a good job or you're not part of it. Don Dodge: Fascinating. Well, thank you. Thank you for being here today and for your contributions to


open source at Google. Chris DiBona: Absolutely. Don Dodge: It's terrific. Chris DiBona: Thanks for having me. Don Dodge: Thank you very much.



Thank you for joining us. And be with us next time on "Google Root Access," when we'll be talking about how startups can use open source. Thank you.


