Oct 28

I'm in no way trying to conflate this with the meaning of my last blog post, but after a six-month gestation, we just gave birth to a public website.

Stack Overflow: none of us is as dumb as all of us

Of course, I'm making a sly little joke here about community, but I really believe in this stuff. Stack Overflow is, as much as I could make it, an effort of collective programmer community.

Here's the original vision statement for Stack Overflow from back in April:

So what is stackoverflow?

From day one, my blog has been about putting helpful information out into the world. I never had any particular aspirations for this blog to become what it is today; I'm humbled and gratified by its amazing success. It has quite literally changed my life. Blogs are fantastic resources, but as much as I might encourage my fellow programmers to blog, not everyone has the time or inclination to start a blog. There's far too much great programming information trapped in forums, buried in online help, or hidden away in books that nobody buys any more. We'd like to unlock all that. Let's create something that makes it easy to participate, and put it online in a form that is trivially easy to find.

Are you familiar with the movie pitch formula?

Stackoverflow is sort of like the anti-experts-exchange (minus the nausea-inducing sleaze and quasi-legal search engine gaming) meets wikipedia meets programming reddit. It is by programmers, for programmers, with the ultimate intent of collectively increasing the sum total of good programming knowledge in the world. No matter what programming language you use, or what operating system you call home. Better programming is our goal.

Although reaction has generally been positive, there has been a bit of backlash. Some have promoted the idea that Stack Overflow will only contribute to the increasing dumbenation of the world's developers. I think this is, in a word, horsecrap. I liked Joel's response to this in podcast 21 (mp3):

And it is true that we are all, as developers, hopelessly incompetent. The goal of a site like Stack Overflow is to somehow share the correct knowledge wherever it may be as it is scattered throughout the universe, and to cause that to be voted up and to be spread amongst us. There's this big universe of dumb programmers, and I'm one of them, and we all have a little bit of knowledge. I may know how to do this thing in VB6 which may be useful to somebody one day who's trying to maintain some ridiculously old piece of crap code. We all have these little tiny pieces of information and if we can just contribute a little bit, that information gets amplified, and maybe a thousand other dumb developers will benefit from my one little piece of good information.

And here's my response, from the same podcast episode, to all those who turn up their noses at community sites like this, preferring the input of "experts":

The idea that you have all these experts waiting in the wings to do stuff is an illusion in my experience. There's really just a bunch of amateurs muddling along trying to do things together. The people that are truly experts are too busy to even help, right? And if the experts are too busy to help, what difference does it really make if there are experts at all. Because the whole point of this endeavor is helping other developers, and whether you're an expert or not, if you have no time to help, you're not really contributing to the solution.

Stack Overflow is by no means done. We're still technically in public beta. But I believe what we have -- the confluence of wiki, discussion, blog, and reddit/digg ranking systems -- is a fair representation of our original vision for Stack Overflow.

venn diagram: wiki - digg/reddit - blog - forum

It's a place where a busy programmer can invest a few minutes with as little friction as possible, and get something tangible from the community in return.

But who cares what I think; my opinion holds no particular weight. I'm just a member. This is our site. You tell me: how dumb are we?




Oct 28
Where we stand...
Posted by George Hotz on October 28, 2008
Ok, here is where we stand right now.

ZiPhone seems to be the tool a lot of people are using. What it does is boot an unsigned ramdisk with a script to jailbreak, activate, and unlock. If you would like to view the ramdisk yourself, cut the first 0xCC2000 bytes from the dat file and mount the rest as a dmg. The script is in /etc/profile. Also, Zibri, patch out the bootloader check from gunlock; it'll work with 3.9.
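The "cut the first 0xCC2000" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names are hypothetical, and treating 0xCC2000 as a count of leading bytes to skip is a reading of the sentence above, not something taken from ZiPhone itself.

```python
import shutil

# Leading data to discard, per the post (assumed here to be a byte count).
HEADER_SIZE = 0xCC2000


def strip_header(src_path, dst_path, header_size=HEADER_SIZE):
    """Copy src_path to dst_path, skipping the first header_size bytes."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(header_size)         # jump past the part to cut
        shutil.copyfileobj(src, dst)  # stream the remainder to the new file


# Hypothetical usage (file names are placeholders, not from the post):
# strip_header("ziphone.dat", "ramdisk.dmg")
```

The resulting file is what you would then try to mount as a dmg.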

ZiPhone is a wrapper for gunlock, which means with 4.6, it currently only unlocks 4.02.13. In order to unlock 4.03.13, right now you need bootloader 3.9.

gbootloader will erase and downgrade your bootloader from software. I have checks in the program to prevent a bootloader without the bootrom locations blank from being uploaded, but if used properly, it will downgrade to 3.9, allowing 4.03.13 to be used.

4.6_GEOMOD is a modified bootloader I have with all secpack stuff patched out, hard-coded IPSF-style unlock (tokens always validate), full anywhere write access, no startup sig checks, and the bootrom locations blank. But the only 4.6 phone I have got bricked while I was trying to restore the seczone, and my bootloader software hack doesn't seem to work in 3.9. I guess I'll have to hw upgrade. Laziness...

Another problem comes with the release of the modified bootloader. It is copyrighted, and the patches are decently complex. What I'd really like to see is an open source, very well coded (the current compiler is crap) bootloader. Say, written in assembly. I believe a full bootloader with all the functionality (minus the security) can fit in under 0x1000 bytes. It should continue to work with bbupdater, but have the crypto state machine fixed to validate everything possible. Maybe I'll get around to writing it. This is the ultimate in baseband hacks, and will put every other hack to rest, once you get the new bootloader on there. I'm sick of patching and trying to understand other people's (badly written) code, when I can just write my own.


Dec 4
When it comes to a more budget-conscious yet highly effective way to revamp a home and entice potential buyers, home staging is a fantastic option. As we will see with this month’s Makeover, Accent on Design Inc. restyled several utilitarian rooms and transformed them into inviting and luxurious spaces certain to lure hungry home [...]


Dec 4

URLs are simple things. Or so you'd think. Let's say you wanted to detect a URL in a block of text and convert it into a bona fide hyperlink. No problem, right?

Visit my website at http://www.example.com, it's awesome!

To locate the URL in the above text, a simple regular expression should suffice -- we'll look for a string at a word boundary beginning with http:// , followed by one or more non-space characters:

\bhttp://[^\s]+

Piece of cake. This seems to work. There's plenty of forum and discussion software out there which auto-links using exactly this approach. Although it mostly works, it's far from perfect. What if the text block looked like this?

My website (http://www.example.com) is awesome.

This URL will be incorrectly encoded with the final paren. This, by the way, is an extremely common way average everyday users include URLs in their text.
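To see the failure concretely, here is a minimal sketch in Python (rather than the .NET flavor used elsewhere in this post) that runs the simple pattern over both sample sentences. The function name is made up for illustration.

```python
import re

# The simple pattern from above: "http://" at a word boundary,
# followed by one or more non-space characters.
NAIVE_URL = re.compile(r"\bhttp://[^\s]+")


def find_urls_naive(text):
    """Return every substring the naive pattern treats as a URL."""
    return NAIVE_URL.findall(text)


print(find_urls_naive("Visit my website at http://www.example.com, it's awesome!"))
print(find_urls_naive("My website (http://www.example.com) is awesome."))
```

Because the pattern accepts any non-space character, the second call returns the URL with the closing paren attached, and the first call sweeps up the trailing comma as well.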

What's truly aggravating is that parens in URLs are perfectly legal. They're part of the spec and everything:

only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

Certain sites, most notably Wikipedia and MSDN, love to generate URLs with parens. The sites are lousy with the damn things:

http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software)
http://msdn.microsoft.com/en-us/library/aa752574(VS.85).aspx

URLs with actual parens in them means we can't take the easy way out and ignore the final paren. You could force users to escape the parens, but that's sort of draconian, and it's a little unreasonable to expect your users to know how to escape characters in the URL.

http://en.wikipedia.org/wiki/PC_Tools_%28Central_Point_Software%29
http://msdn.microsoft.com/en-us/library/aa752574%28VS.85%29.aspx
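For reference, the escaping shown above is nothing more than swapping each paren for its percent-encoded form. A minimal Python sketch, with a hypothetical helper name:

```python
def escape_parens(url):
    """Percent-encode the parens in a URL, leaving everything else alone."""
    # "(" is 0x28 and ")" is 0x29 in ASCII, hence %28 and %29.
    return url.replace("(", "%28").replace(")", "%29")


print(escape_parens("http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software)"))
print(escape_parens("http://msdn.microsoft.com/en-us/library/aa752574(VS.85).aspx"))
```

Trivial for a program, but not something you can expect the average user to do by hand.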

To detect URLs correctly in most cases, you have to come up with something more sophisticated. Granted, this isn't the toughest problem in computer science, but it's one that many coders get wrong. Even coders with years of experience, like, say, Paul Graham.

If we're more clever in constructing the regular expression, we can do a better job.

\(?\bhttp://[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]

  1. The primary improvement here is that we're only accepting a whitelist of known good URL characters. Allowing arbitrary random characters in URLs is setting yourself up for XSS exploits, and I can tell you that from personal experience. Don't do it!
  2. We only allow certain characters to "end" the URL. Ending a URL in common punctuation marks like period, exclamation point, semicolon, etc., means those characters will be considered end-of-hyperlink characters and not included in the URL.
  3. Parens, if present, are allowed in the URL -- and we absorb the leading paren, if it is there, too.
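The three points above can be checked with a short Python sketch. The pattern is copied unchanged from the post; the function name and the sample sentences other than the two used earlier are made up for illustration.

```python
import re

# Whitelist pattern from above: an optional leading paren, "http://",
# any run of allowed URL characters, then one character from the
# narrower set that is allowed to end a URL.
BETTER_URL = re.compile(
    r"\(?\bhttp://[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]"
)


def find_urls(text):
    """Return every substring the whitelist pattern treats as a URL."""
    return BETTER_URL.findall(text)


# Trailing comma is no longer part of the match.
print(find_urls("Visit my website at http://www.example.com, it's awesome!"))
# Enclosing parens are absorbed on both sides.
print(find_urls("My website (http://www.example.com) is awesome."))
# A URL that legitimately ends in a paren is matched whole.
print(find_urls("See http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software) for details."))
```

Note that the second match still carries both enclosing parens; the regex alone cannot tell them apart from the Wikipedia-style ones.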

I couldn't come up with a way for the regex alone to distinguish between URLs that legitimately end in parens (à la Wikipedia), and URLs that the user has enclosed in parens. Thus, there has to be a bit of postfix code to detect and discard the user-enclosed parens from the matched URLs:

if (s.StartsWith("(") && s.EndsWith(")"))
{
    return s.Substring(1, s.Length - 2);
}
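Putting the regex and the paren check together, a complete auto-linker might look roughly like the Python sketch below. The function name, the HTML output format, and the handling of a lone leading paren are assumptions added for illustration; only the pattern and the starts-with/ends-with check come from this post.

```python
import re
from html import escape

URL_PATTERN = re.compile(
    r"\(?\bhttp://[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]"
)


def autolink(text):
    """Wrap each detected URL in an <a> tag, keeping enclosing parens outside."""

    def replace(match):
        s = match.group(0)
        prefix = suffix = ""
        if s.startswith("(") and s.endswith(")"):
            # User-enclosed URL: the same check as the snippet above.
            s, prefix, suffix = s[1:-1], "(", ")"
        elif s.startswith("("):
            # Assumption beyond the original snippet: an absorbed leading
            # paren with no closing partner is not part of the URL either.
            s, prefix = s[1:], "("
        href = escape(s, quote=True)  # HTML-escape "&" and friends
        return '{0}<a href="{1}">{1}</a>{2}'.format(prefix, href, suffix)

    return URL_PATTERN.sub(replace, text)


print(autolink("My website (http://www.example.com) is awesome."))
```

This still gets confused by text such as a parenthetical remark that ends with a Wikipedia-style URL, so treat it as a sketch of the approach rather than a finished linker.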

That's a whole lot of extra work, just because the URL spec allows parens. We can't fix Wikipedia or MSDN and we certainly can't change the URL spec. But we can ensure that our websites avoid becoming part of the problem. Avoid using parens (or any unusual characters, for that matter) in URLs you create. They're annoying to use, and rarely handled correctly by auto-linking code.




Dec 4

Wikipedia is one of the world's most visited web sites (8th in the top 10, in fact), delivering an enormous breadth of content to an audience as vast as the internet. But Wikipedia's evolved to become more than an on-line encyclopedia: they've become one of the world's largest search engines, they're a global source of real-time news, alongside educational, political and health related content - and one of the world's most valuable brands and media properties.

Wikipedia's also a great example of a "redshift" application: a segment of the market that's growing faster than the technology industry's capacity to innovate. Technology companies have to pay special attention to such redshifted segments - not only do they eventually grow the overall market, but their innovation often drives the technology landscape. Broadly speaking, social media, from free news and social networking, to search and content sharing, is doing exactly that - defining new architectures and requirements for radical scale, economics and availability.

So I was really pleased that Wikimedia had chosen Sun's Open Storage platforms over proprietary alternatives, to help manage their evolution to rich media - bringing high quality video and time based content to their more than 250,000,000 users globally. That's a big audience waiting to upload - and interact with - high quality content.

Like Wikipedia, most of the planet's largest web sites (just look at the top 100) are built atop Sun's MySQL database. Which is why we've just introduced a line of system platforms designed specifically to run MySQL - at up to 3x the performance of whitebox alternatives (after all, it's far easier marketing to audiences that have already chosen Sun). We're now expanding those offerings with our newest Open Storage portfolio, as well - built to run ZFS from 5 to 50x traditional performance. And again, all such systems are available here for free trial - pick the system you want to try, we'll cover shipping costs to and from your site.

And while I'm on the topic of systems... I've been asked for insights into our recent software reorganization, in which we announced three main focus groups (a Systems group, an Applications group, and a Cloud group). Why'd we make that change?

First, look no further than this win for one of my main motivations: I'd like to enhance the value and alignment we offer to customers that want to run our system software (like MySQL and ZFS) at very high scale - and require, from Sun and our OEM partners, the tightest possible technical collaboration and alignment between hardware and software.

Second, this move amplifies the obvious (at least to us): the storage market will be larger than the server market, but you may not be able to tell - they're converging, built from the same systems software and hardware components (networking will follow the same path, more on that in the future).

Finally, adoption and software distribution/marketing is different than revenue generation. And with the adoption of ZFS well underway, technical and business alignment have become our dominant priorities. It's at the heart of what's fueling one of Sun's fastest growing businesses (ZFS based Open Storage was up more than 150% last quarter, growing far faster than our proprietary peers).

How large is the redshift opportunity? It's not just businesses like Wikipedia that are defining new scale requirements for the industry. It's the on-line bank I saw last week, now serving more than 100m accounts globally - contemplating the addition of video chat for customer service. It's the government customer I just visited trying to deliver driver's license and passport renewal services to hundreds of millions of its citizens. The term redshift describes applications, not customers (remember, even Wikipedia has payroll - not exactly a redshift application).

In an openly networked world, redshift applications begin to equate to social phenomena - and social phenomena don't respect your IT budget. Which is to say, neither a 10 person startup nor a 10,000 person retailer wants to go broke buying software licenses and storage, just because they've struck a chord with the planet. Which is increasingly why both sides of the industry are moving to open source.

And open storage.

________________

(And with apologies to the OpenOffice community - we are not going to be inserting ads into OpenOffice.org - we're creating partnerships to brand and promote StarOffice, and the cloud we're developing behind it.)


