Statistics help needed

Well, it seems I bit off more than I can chew with my recent poll on Web browser window size. I now have, thanks to the help of a clever Ruby script Frode Danielsen sent me, a CSV file (results.csv) containing information about 1385 different setups. But I cannot for the life of me figure out how to get anything useful out of it.

I am a complete spreadsheet illiterate, but I gave it a try. Opening the file and getting the data into three different columns is easy. But what then? How can I turn that into a few nice pie-chart diagrams? I have spent (too) many hours on this, looking through the completely useless help in Excel, googling for help, and throwing things across the room in frustration.

I just don’t get it, and am about to give up. Are there better applications than Excel or OpenOffice Calc for this kind of thing?

If you know how to turn the data in the CSV file into something useful, feel free to do so. I would also really appreciate some pointers to help with this. Not understanding how to do something that should be really simple is extremely frustrating.

Update: Thanks for helping me out, everybody! I think I have what I need, so I don’t want to occupy any more of your time with this :-).

Update (2007-04-15): I have published the results of the poll in Poll results: 50.4% of respondents maximise windows.

  • April 14, 2007

Comments

1. April 14, 2007 by Jeremy McCullough

I believe you would want to count each one of the items, and then use the results from that to make a pie graph. I'll run it through Excel real quick and see what I can come up with.

2. April 14, 2007 by Jeremy McCullough

I've put in some formulas to count how many of each item there were.

The data looked to plug into a pie graph rather nicely.

There were a lot of different fields, and I'm not sure how you wanted each consolidated, so I didn't touch anything else. There were a few resolutions that only returned one, which would probably be better grouped under 'Other'.

But I'll leave that to you. :)

You may need to add up some numbers manually, but that shouldn't be nearly as hard as counting 1385*3 fields.

Hope it helped.

The file is at: http://www.disidius.com/webstats.xls
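For readers who would rather script the counting step than use spreadsheet formulas, here is a minimal Python sketch of the same idea: count each distinct value, then fold values that occur only once into 'Other'. The function name `tally`, the `min_count` threshold, and the sample list are assumptions made for illustration; they are not taken from the poll data or from Jeremy's file.

```python
from collections import Counter


def tally(values, min_count=2):
    """Count each distinct value; fold values seen fewer than
    min_count times into a single 'Other' bucket."""
    counts = Counter(values)
    grouped = Counter()
    for value, n in counts.items():
        grouped[value if n >= min_count else "Other"] += n
    return grouped


# Made-up sample for illustration only, not the real poll data.
sample = ["1280x1024", "1024x768", "1280x1024",
          "1600x1200", "1024x768", "1152x864"]
result = tally(sample)

for value, n in result.most_common():
    print(value, n)
```

The grouped counts still add up to the number of rows, so they can be fed directly into a pie chart.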

3. April 14, 2007 by Roger Johansson

Wow, that was quick, and many many thanks - that looks very useful! Like you say I will want to manually do some consolidation, but this is a good start.

Thanks again. I don't know why this was so hard for me to do myself, but..

4. April 14, 2007 by Tigerblade

Looking at Jeremy's file, there are also a few results that were probably typos, or didn't get gathered correctly by the script. Like, there's "Mac OS X", "Mac OSX", and "Max OS X" as three separate entries. Those are all supposed to be the same (probably), but they end up being read as different systems.

Or 1024x768 vs 1024x786. I'm guessing those are supposed to be the same resolution, just got a visit from the typo fairy.
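One way to deal with variants like these before counting is a small correction table. The sketch below is in Python and assumes the three corrections mentioned in this comment are the right ones; the table name `CORRECTIONS` and the function `normalise` are hypothetical, and a real clean-up would need the table extended by hand after looking at the data.

```python
# Hypothetical correction table, built only from the variants named above.
CORRECTIONS = {
    "Mac OSX": "Mac OS X",
    "Max OS X": "Mac OS X",
    "1024x786": "1024x768",
}


def normalise(value):
    """Trim and collapse whitespace, then map known typos to one spelling."""
    cleaned = " ".join(value.split())
    return CORRECTIONS.get(cleaned, cleaned)


print(normalise("  Mac OSX "))
print(normalise("1024x786"))
print(normalise("Windows XP"))
```

Values that are not in the table pass through unchanged, so nothing is silently merged.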

5. April 14, 2007 by Kevin Laurence

The best way to analyze the data in Excel is to use a Pivot Table, but they can be tricky.

On the other hand, if you know what you're doing they can be quite quick.

See http://www.kevinlaurence.net/Results_456BereaSt.xls for my attempt.

Kevin
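For comparison, a pivot table of this kind is essentially a cross-tabulation, and it can be sketched in a few lines of Python. The example assumes three columns in the order maximised (true/false), resolution, operating system; only the first column is confirmed elsewhere in this thread, so the rest of the layout is an assumption, and the rows in `SAMPLE_CSV` are made up.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the assumed column order: maximised, resolution, OS.
SAMPLE_CSV = """true,1024x768,Windows XP
false,1280x1024,Mac OS X
true,1280x1024,Windows XP
false,1680x1050,Mac OS X
true,1280x800,Mac OS X
"""


def pivot_maximised_by_os(csv_text):
    """Cross-tabulate OS against the maximised flag.

    Expects the flag to be 'true' or 'false' (any letter case)."""
    table = defaultdict(lambda: {"true": 0, "false": 0})
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        maximised, _resolution, os_name = row
        table[os_name.strip()][maximised.strip().lower()] += 1
    return dict(table)


pivot = pivot_maximised_by_os(SAMPLE_CSV)
for os_name, counts in sorted(pivot.items()):
    total = counts["true"] + counts["false"]
    print(f"{os_name}: {counts['true']} of {total} maximised")
```

Swapping the OS column for the resolution column in the loop gives the same breakdown by resolution.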

6. April 14, 2007 by Roger Johansson

Tigerblade: Yep, there are a few typos that I missed while cleaning up the data.

Kevin: I had a feeling that a pivot table would be the way to go, but I couldn't figure it out. Thanks for the help! Now I have a couple of different files to learn from, and I should be able to use them to present the results nicely.

7. April 14, 2007 by Ted Goas

Jeremy, that WAS quick! Roger, let me know if you'd like any additional help. I've been heavily involved in stats like this for the past few months, from a few personal studies. However, I won't bother downloading the csv if you're already content.

Looking forward to seeing the data and some trends from the data! The poll looked very successful from all the responses.

8. April 14, 2007 by xxdesmus

I have some numbers for you, but they aren't very pretty right now.

For example,

MAXIMIZED: False - 687 (49.6%), True - 698 (50.4%)

RESOLUTION: 1024x768 - 159 (11.5%), 1280x1024 - 450 (32.5%), 1280x800 - 110 (7.9%), ...etc

OS: Mac OS X - 450 (31.9%), Ubuntu - 77 (5.6%), Windows 2000 - 30 (2.2%), Windows XP - 691 (49.9%), Windows Vista - 46 (3.3%), ...etc

I have plenty more numbers; if you're interested, drop me an email :)
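The percentages are simply each count divided by the total number of responses. A short Python sketch, using the MAXIMIZED counts quoted in this comment (the function name `shares` is hypothetical):

```python
def shares(counts):
    """Return each count as a percentage of the total, to one decimal."""
    total = sum(counts.values())
    return {key: round(100 * n / total, 1) for key, n in counts.items()}


# Counts quoted above: 687 + 698 = 1385 responses.
maximised = {"False": 687, "True": 698}
percentages = shares(maximised)
print(percentages)  # {'False': 49.6, 'True': 50.4}
```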

9. April 14, 2007 by Rik Hemsley

My girlfriend helped me out with her Excel skills. We came up with this graph: http://rikkus.info/arch/screen-resolution-and-maximising.png It shows percentage of maximisers by resolution and OS.

Our analysis of the results, based on the graph, is:

  1. The greater the screen resolution you have, the less likely you are to maximise your browser window.

  2. Mac OS users are much less likely than Windows, Linux and BSD users to maximise their browser windows.

10. April 14, 2007 by Rik Hemsley

In case you want to see the spreadsheet we used to make that graph (we simplified the data a great deal) it's here:

http://rikkus.info/arch/os.xlsx

Excel 2007 needed, sorry!

11. April 14, 2007 by xxdesmus

^^ Your direct link doesn't work, it thinks it's a hotlink.

12. April 15, 2007 by Proud

Why bother with Excel? Just use Keynote! Open up a slide, click the "Charts" button up top, and BAM! a chart editor. (And it's so much sexier too.)

13. April 15, 2007 by Proud

Oops, I didn't realize the data was largely unprocessed. Never mind then!

14. April 15, 2007 by Marcelo Wolfgang

I've made one too; if you want to try it, it's at:

http:/work.grillo.tk/excel/data.xlxs

My gf helped me too, since web designers can't excel :)

15. April 15, 2007 by Gary Turner

Roger, I think you asked for the wrong data, or for not enough. The screen size can only tell you the maximum available window size, and has no direct relationship to the window size in use. You should ask for the browser window size in use. See this on actual browser sizes if you haven't already.

Second, I think it would help to separate the width from the height. It is the width that's of primary interest, is it not? I ran a simple sed replacement command against the csv file.

gt@koko:~/456$ sed 's/x/,/' < 456.csv > 456a.csv

Now you have a four column table and can look at width alone.
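The same split can be done per value in a script. The Python sketch below parses a resolution string such as 1280x1024 into a numeric width and height; the function name `split_resolution` is hypothetical, and the code assumes every value has the form WIDTHxHEIGHT, so typos in the data would need fixing first.

```python
def split_resolution(resolution):
    """Split 'WIDTHxHEIGHT' into a (width, height) pair of integers.

    Raises ValueError if the value does not have that form."""
    width, height = resolution.strip().lower().split("x", 1)
    return int(width), int(height)


width, height = split_resolution("1280x1024")
print(width, height)  # 1280 1024
```

Unlike the sed command, which replaces the first lowercase x anywhere on the line, this only touches the resolution value it is given.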

Some points, interesting and otherwise:

The median screen size is 1280×1024

Slightly more than half (698 of 1385) run their browsers maximized.

Of those, 572 have screens wider than 1024, thus wasting an amazing amount of desktop. (OK, that's editorializing)

Of the folks who don't run maximized, there is nothing of value to glean.
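Figures like these are straightforward to recompute once width is a separate column. A minimal Python sketch, using made-up rows rather than the poll data, and looking at the median width only; the tuple layout (maximised, width, height) is an assumption for illustration:

```python
from statistics import median

# Made-up rows for illustration: (maximised, width, height).
rows = [
    (True, 1024, 768),
    (True, 1280, 1024),
    (False, 1280, 1024),
    (True, 1680, 1050),
    (False, 1920, 1200),
]

# Median screen width across all rows.
median_width = median(width for _, width, _ in rows)

# Rows where the browser is maximised, and the subset wider than 1024.
maximised = [row for row in rows if row[0]]
wide_and_maximised = [row for row in maximised if row[1] > 1024]

print("median width:", median_width)
print("maximised:", len(maximised), "of", len(rows))
print("maximised and wider than 1024:", len(wide_and_maximised))
```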

cheers,

gary

16. April 15, 2007 by Harvey Ramer

This post brings to mind the useful reporting that Google Analytics does. Here's an example from the past week: Screen Resolution Graphs

17. April 15, 2007 by Roger Johansson

Rik, Marcelo: Unfortunately I don't have Excel 2007 (and I doubt I ever will since I'm a Mac user), so I can't open your files. Thanks anyway :-).

Gary: Yep, I'm aware of the screen size vs. actual window size difference. I figured it would be much easier for people to just tell me their screen size.

I have actually started to do some cleaning to look at screen width only, so thanks for that sed replacement command.

18. April 15, 2007 by Chris San-Claire

Here is my bit. It could be much better with time; I'm not too sure how you wanted to use the data. I have not done this for a long time...

Here is the Excel sheet to download. Hope it helps...

19. April 16, 2007 by Michel

@Roger:

Excel 2007? Well, I think Microsoft is working on the next Office for Mac right now, so don't worry :) In the worst case, ask someone to convert the file for you; it's not that hard ;-)

(Sorry, can't help; I am a hopeless WinXP user up to now ;-)

20. April 26, 2007 by Jason Crowther

Get up to speed on Excel in a flash with Microsoft Office Excel 2003 for Windows (Visual QuickStart Guide)

Task oriented and direct!

21. May 3, 2007 by Chris

A very easy way to handle data is through a database, e.g. Access. You can count instances, sum values, etc.

I looked at the .csv file but I didn't know what the first column of true/false indicated so I couldn't really play.

Hope this helps -C


About the author

Roger Johansson is a Swedish web professional specialising in web standards, accessibility, and usability. More about me and this site.
