Statistics help needed

Well, it seems I bit off more than I can chew with my recent poll on Web browser window size. I now have, thanks to the help of a clever Ruby script Frode Danielsen sent me, a CSV file (results.csv) containing information about 1385 different setups. But I cannot for the life of me figure out how to get anything useful out of it.

I am a complete spreadsheet illiterate, but I gave it a try. Opening the file and getting the data into three different columns is easy. But what then? How can I turn that into a few nice pie-chart diagrams? I have spent (too) many hours on this, looking through the completely useless help in Excel, googling for help, and throwing things across the room in frustration.

I just don’t get it, and am about to give up. Are there better applications than Excel or OpenOffice Calc for this kind of thing?

If you know how to turn the data in the CSV file into something useful, feel free to do so. I would also really appreciate some pointers to help with this. Not understanding how to do something that should be really simple is extremely frustrating.

Update: Thanks for helping me out, everybody! I think I have what I need, so I don’t want to occupy any more of your time with this :-).

Update (2007-04-15): I have published the results of the poll in Poll results: 50.4% of respondents maximise windows.

Posted on April 14, 2007 in Usability


  1. I believe you would want to count each one of the items, and then use the results from that to make a pie graph. I’ll run it through Excel real quick and see what I can come up with.
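
The counting step suggested here can also be done outside a spreadsheet. The following is only a minimal sketch: it assumes results.csv has three columns in the order maximised flag, resolution, operating system, with no header row, and it uses a few made-up sample rows in place of the real file. The helper name `count_column` is hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real results.csv is assumed to have the
# same three columns (maximised, resolution, OS) and no header row.
SAMPLE = """\
true,1024x768,Windows XP
false,1280x1024,Mac OS X
true,1280x1024,Windows XP
false,1280x800,Mac OS X
"""

def count_column(rows, index):
    """Count how often each value appears in the given column."""
    return Counter(row[index].strip() for row in rows if row)

rows = list(csv.reader(io.StringIO(SAMPLE)))
os_counts = count_column(rows, 2)
total = sum(os_counts.values())

# Print each value with its count and share, ready to feed a pie chart.
for name, n in os_counts.most_common():
    print(f"{name}: {n} ({100 * n / total:.1f}%)")
```

To run it against the real data, `io.StringIO(SAMPLE)` would be replaced with an opened file, for example `open("results.csv", newline="")`.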

  2. I’ve put in some formulas to count how many of each item there were.

    The data looked to plug into a pie graph rather nicely.

    There were a lot of different fields, and I’m not sure how you wanted each consolidated, so I didn’t touch anything else. There were a few resolutions that appeared only once, which would probably be better grouped under ‘Other’.

    But I’ll leave that to you. :)

    It may require you to manually add up some numbers, but that shouldn’t be nearly as hard as counting 1385*3 fields.

    Hope it helped.

    The file is at:

  3. April 14, 2007 by Roger Johansson (Author comment)

    Wow, that was quick, and many many thanks - that looks very useful! Like you say I will want to manually do some consolidation, but this is a good start.

    Thanks again. I don’t know why this was so hard for me to do myself, but..

  4. Looking at Jeremy’s file, there are also a few results that were probably typos, or didn’t get gathered correctly by the script. Like, there’s “Mac OS X”, “Mac OSX”, and “Max OS X” as three separate entries. Those are all supposed to be the same (probably), but they end up being read as different systems.

    Or 1024x768 vs 1024x786. I’m guessing those are supposed to be the same resolution, just got a visit from the typo fairy.
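
Variants like these can be merged before counting. The sketch below is one possible approach, not what anyone in the thread actually did: the correction tables contain only the misspellings named in this comment, the column order (maximised, resolution, OS) is an assumption, and `clean_row` is a hypothetical helper.

```python
# Corrections taken from the variants mentioned in the comment above; any
# further entries would have to come from inspecting the real data.
OS_FIXES = {"Mac OSX": "Mac OS X", "Max OS X": "Mac OS X"}
RESOLUTION_FIXES = {"1024x786": "1024x768"}

def clean_row(row):
    """Return a copy of a (maximised, resolution, OS) row with known typos fixed."""
    maximised, resolution, os_name = (field.strip() for field in row)
    resolution = RESOLUTION_FIXES.get(resolution, resolution)
    os_name = OS_FIXES.get(os_name, os_name)
    return [maximised, resolution, os_name]

print(clean_row(["true", "1024x786", "Max OS X"]))
```

Keeping the fixes in explicit lookup tables makes every change to the data visible, which matters when a "typo" might in fact be a real but unusual setup.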

  5. The best way to analyze the data in Excel is to use a Pivot Table, but they can be tricky.

    On the other hand, if you know what you’re doing they can be quite quick.

    See for my attempt.


  6. April 14, 2007 by Roger Johansson (Author comment)

    Tigerblade: Yep, there are a few typos that I missed while cleaning up the data.

    Kevin: I had a feeling that a pivot table would be the way to go, but I couldn’t figure it out. Thanks for the help! Now I have a couple of different files to learn from, and I should be able to use them to present the results nicely.

  7. Jeremy, that WAS quick! Roger, let me know if you’d like any additional help. I’ve been heavily involved in stats like this for the past few months, from a few personal studies. However, I won’t bother downloading the csv if you’re already content.

    Looking forward to seeing the data and some trends from it! Judging by all the responses, the poll looked very successful.

  8. I have some numbers for you, but they aren’t very pretty right now.

    For example,

    MAXIMIZED: False - 687 (49.6%), True - 698 (50.4%)

    RESOLUTION: 1024x768 - 159 (11.5%), 1280x1024 - 450 (32.5%), 1280x800 - 110 (7.9%), …etc

    OS: Mac OS X - 450 (31.9%), Ubuntu - 77 (5.6%), Windows 2000 - 30 (2.2%), Windows XP - 691 (49.9%), Windows Vista - 46 (3.3%), …etc

    I have plenty more numbers; if you’re interested, drop me an email :)

  9. My girlfriend helped me out with her Excel skills. We came up with this graph: It shows percentage of maximisers by resolution and OS.

    Our analysis of the results, based on the graph, is:

    1. The greater the screen resolution you have, the less likely you are to maximise your browser window.

    2. Mac OS users are much less likely than Windows, Linux and BSD users to maximise their browser windows.
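
The kind of breakdown behind that graph, the percentage of maximisers per resolution or per OS, can be sketched as follows. This is an illustration under assumptions rather than the spreadsheet that was actually used: the rows are invented samples, the column order (maximised, resolution, OS) is assumed, and `maximised_share` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical sample rows in the assumed column order:
# maximised flag, resolution, operating system.
rows = [
    ("true", "1024x768", "Windows XP"),
    ("true", "1280x1024", "Windows XP"),
    ("false", "1280x1024", "Mac OS X"),
    ("true", "1280x1024", "Mac OS X"),
    ("false", "1680x1050", "Mac OS X"),
]

def maximised_share(rows, key_index):
    """Return the percentage of maximised windows per value of one column."""
    totals = defaultdict(int)
    maximised = defaultdict(int)
    for row in rows:
        key = row[key_index]
        totals[key] += 1
        if row[0].strip().lower() == "true":
            maximised[key] += 1
    return {key: 100 * maximised[key] / totals[key] for key in totals}

by_os = maximised_share(rows, 2)
by_resolution = maximised_share(rows, 1)

for name, share in sorted(by_os.items()):
    print(f"{name}: {share:.1f}% maximised")
```

Groups with only a handful of respondents give unstable percentages, so it may be worth merging rare values into an "Other" group first.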

  10. In case you want to see the spreadsheet we used to make that graph (we simplified the data a great deal) it’s here:

    Excel 2007 needed, sorry!

  11. ^^ Your direct link doesn’t work, it thinks it’s a hotlink.

  12. April 15, 2007 by Proud

    Why bother with Excel? Just use Keynote! Open up a slide, click the “Charts” button up top, and BAM! a chart editor. (And it’s so much sexier too.)

  13. April 15, 2007 by Proud

    Oops, I didn’t realize the data was largely unprocessed. Never mind then!

  14. I’ve made one too; if you want to try it, it’s at:

    My gf helped me too, since web designers can’t excel :)

  15. Roger, I think you asked for the wrong data, or for not enough. The screen size can only tell you the maximum available window size, and has no direct relationship to the window size in use. You should ask for the browser window size in use. See this on actual browser sizes if you haven’t already.

    Second, I think it would help to separate the width from the height. It is the width that’s of primary interest, is it not? I ran a simple sed replacement command against the csv file.

    gt@koko:~/456$ sed 's/x/,/' < 456.csv > 456a.csv

    Now you have a four column table and can look at width alone.

    Some points, interesting and otherwise:

    The median screen size is 1280×1024

    Slightly more than half (698 of 1385) run their browsers maximized.

    Of those, 572 have screens wider than 1024, thus wasting an amazing amount of desktop. (OK, that’s editorializing)

    Of the folks who don’t run maximized, there is nothing of value to glean.
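
The width/height split and the figures listed above can be reproduced along these lines. This is a sketch only: the rows are invented, the column order (maximised, resolution, OS) is an assumption, `split_resolution` is a hypothetical helper, and the median is taken over screen width alone rather than over full resolutions.

```python
from statistics import median_low

# Hypothetical sample rows in the assumed column order:
# maximised flag, resolution, operating system.
rows = [
    ["true", "1024x768", "Windows XP"],
    ["true", "1280x1024", "Windows XP"],
    ["false", "1280x1024", "Mac OS X"],
    ["true", "1680x1050", "Ubuntu"],
    ["false", "1280x800", "Mac OS X"],
]

def split_resolution(row):
    """Turn a three-column row into four columns with numeric width and height."""
    maximised, resolution, os_name = row
    width, height = resolution.lower().split("x", 1)
    return [maximised, int(width), int(height), os_name]

table = [split_resolution(row) for row in rows]

# median_low returns a value that actually occurs in the data.
widths = [row[1] for row in table]
median_width = median_low(widths)

# Maximised respondents, and those among them with screens wider than 1024.
maximised = [row for row in table if row[0] == "true"]
wide_maximised = [row for row in maximised if row[1] > 1024]

print(median_width, len(maximised), len(wide_maximised))
```

Splitting only the resolution field, instead of replacing the first "x" on the whole line, avoids touching an "x" that happens to appear in another column.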



  16. This post brings to mind the useful reporting that Google Analytics does. Here’s an example from the past week: Screen Resolution Graphs

  17. April 15, 2007 by Roger Johansson (Author comment)

    Rik, Marcelo: Unfortunately I don’t have Excel 2007 (and I doubt I ever will since I’m a Mac user), so I can’t open your files. Thanks anyway :-).

    Gary: Yep, I’m aware of the screen size vs. actual window size difference. I figured it would be much easier for people to just tell me their screen size.

    I have actually started to do some cleaning to look at screen width only, so thanks for that sed replacement command.

  18. Here is my bit. It could be much better with more time, and I’m not too sure how you wanted to use the data. I have not done this for a long time…

    Here is the Excel sheet to download. Hope it helps…

  19. @Roger:

    Excel 2007? Well, I think Microsoft is working on the next Office for Mac right now, so don’t worry :) In the worst case, ask someone to convert the file for you; it’s not that hard ;-)

    (Sorry, I can’t help. I am a hopeless WinXP user up to now ;-)

  20. Get up to speed on Excel in a flash with Microsoft Office Excel 2003 for Windows (Visual QuickStart Guide)

    Task oriented and direct!

  21. May 3, 2007 by Chris

    A very easy way to handle data is through a database, e.g. Access. You can count instances, sum values, etc.

    I looked at the .csv file but I didn’t know what the first column of true/false indicated so I couldn’t really play.

    Hope this helps -C
