This is the first in a series of posts that analyze the images on the home pages of hospital sites. This post looks at all the images that make up each web page and finds the largest one, as determined by its file size in kilobytes. Using the list in Seed #2, I created a script that accomplishes this task.
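To give a rough idea of how such a script could work, here is a minimal sketch and not the actual script used for the post. It assumes that only `<img>` tags are considered (CSS background images are ignored), and the names `ImageCollector`, `largest_image`, and `size_of` are hypothetical. The size lookup is passed in as a function, so it could be backed by an HTTP request for each image or by files already saved on disk.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are routed here as well.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def largest_image(html, base_url, size_of):
    """Return (url, size_in_kb) for the largest image, or None if there are none.

    size_of is an assumed callable that maps an absolute image URL
    to its size in bytes.
    """
    collector = ImageCollector()
    collector.feed(html)
    # Resolve relative paths against the page URL and drop duplicates.
    urls = {urljoin(base_url, src) for src in collector.sources}
    if not urls:
        return None
    sizes = {url: size_of(url) for url in urls}
    biggest = max(sizes, key=sizes.get)
    return biggest, sizes[biggest] / 1024
```

Running `largest_image` once per home page in the seed list would give the largest image for each hospital.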
One of the tasks performed by the scripts I use to explore hospital websites is taking screenshots, so I can compare how each site changes over time. This makes it easy to spot trends in website design and the differences between sites. The gallery below shows these screenshots.
I’ve collected a new list of hospitals and their respective URLs to make ‘Seed #2’. This list was created in June 2016 and contains 946 hospitals.
Here’s the full list:
The following analysis was done using the Seed #1 dataset and determines the most commonly used HTML tags. This dataset contains the home page HTML of 1,657 hospitals. By parsing the source code of these sites, I was able to determine the most frequently used tags and the sites with the most tags on their home pages.
First, these are the top 10 hospitals with the most tags. The average number of tags per page in this dataset is 390, while the median is 320.
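As an illustration of how these numbers could be computed, here is a minimal sketch using only the Python standard library. It is not the original analysis code: it assumes that each opening tag counts as one tag, and the names `TagCounter`, `count_tags`, and `summarize` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser
from statistics import mean, median


class TagCounter(HTMLParser):
    """Count how many times each opening tag appears in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html):
    parser = TagCounter()
    parser.feed(html)
    return parser.counts


def summarize(pages):
    """Summarize tag usage; pages maps a site name to its home page HTML."""
    per_site = {site: count_tags(html) for site, html in pages.items()}
    totals = {site: sum(counts.values()) for site, counts in per_site.items()}
    overall = sum(per_site.values(), Counter())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {
        "most_common_tags": overall.most_common(10),  # most used tags overall
        "top_sites": ranked[:10],                     # sites with the most tags
        "mean": mean(totals.values()),
        "median": median(totals.values()),
    }
```

Calling `summarize` on a dictionary of saved home pages returns the most common tags, the ten sites with the most tags, and the mean and median tag counts per page.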
This is what I am calling ‘Seed #1’, the dataset I will be analyzing in the coming blog posts. Even though I have years’ worth of data, I am planning to start small with a list of 1,657 hospital sites. This list is dated mid-June 2015, and the results of the analysis reflect a snapshot as of that date.