Formal Experiment Report

506.423 3VU Information Architecture and Web Usability WS 2003/2004

Group 5

Albert Strasser
Wolfgang Auer
Lei Ming

Readability of text for varying foreground/background colour combinations

Report of 19. Dec. 2003

1. Related Work

Of the papers on this research topic, the following seemed the most interesting to us:

a.) Richard Hall - The Effect of Web page Text-Background colors[1]

In this experiment Hall measured retention as an effect of text-background color combinations, together with the subjective perception of the color combinations. Thirty students each read two prepared web pages with two different text-background combinations. After reading the first page, each test user answered a questionnaire and rated a few questions on a 10-point Likert scale. Four different text-background color combinations were tested in the whole experiment.

The result was that the color combination had no significant influence on retention.

b.) Chris Ridpath - Testing The Readability Of Web Page Colors [2]

In this experiment Ridpath used 42 different text-background combinations. The test was based on an online program. Test users rated the readability of the text-background combinations on a Likert scale.

The result was that the judgement based on brightness difference and color difference is not entirely accurate. The bigger the brightness difference between text and background color, the better the test users' rating.

c.) Lauren Scharff - Readability Of Websites With Various Foreground/Background Color Combinations, Font Types And Word Styles [3]

Scharff is one of the pioneers in running tests to measure the effect of color combinations. The experiment involved 43 participants, 6 color combinations and three font types. The test users had to scan a text for a shape word (e.g. circle). When they found one of the shape words, they had to click on the corresponding shape at the bottom of the screen.

The result was that there is a connection between font type and color combination.

2. User Profile

3. Test Methodology

3.1 Participants

Sixteen people volunteered for this study (11 men and 5 women): 15 university students and one occupational therapist. They ranged in age from 21 to 26 (M = 24). The median web use for the participants was 10 hours per week.

 

ID    Date      Time   First Name  Sex  Age  Education               PC experience  Web/week
TP1   25.11.03  18:03  Klaus       m    24   Student                 12 years       40 hours
TP2   25.11.03  18:30  Martin Fe.  m    24   Student                 8 years        10 hours
TP3   25.11.03  18:57  Clemens     m    24   Student                 5 years        10 hours
TP4   25.11.03  19:20  Susanne     f    21   Student                 6 years        34 hours
TP5   25.11.03  20:26  Matthias    m    24   Student                 10 years       3 hours
TP6   27.11.03  09:00  Reinhard    m    25   Student                 10 years       40 hours
TP7   27.11.03  10:10  Florian     m    24   Student                 10 years       40 hours
TP8   27.11.03  11:00  Raphaela    f    24   Student                 10 years       15 hours
TP9   27.11.03  18:45  Stefan      m    24   Student                 12 years       3 hours
TP10  27.11.03  19:15  Isabella    f    22   Occupational therapist  10 years       8 hours
TP11  27.11.03  20:03  Pamela      f    23   Student                 10 years       2 hours
TP12  27.11.03  20:40  Marlene     f    22   Student                 4 years        3 hours
TP13  28.11.03  14:00  Markus      m    26   Student                 12 years       5 hours
TP14  28.11.03  14:47  Martin M.   m    23   Student                 10 years       15 hours
TP15  1.12.03   16:00  Martin Fa.  m    25   Student                 15 years       3 hours
TP16  1.12.03   16:15  Udo         m    24   Student                 9 years        30 hours
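
The summary figures for the participants can be re-derived from the table above. The following is a minimal Python sketch; the list names are illustrative and the values are copied from the columns Sex, Age and Web/week.

```python
from statistics import mean, median

# Age and weekly web use (hours) for TP1..TP16, copied from the participant table.
ages = [24, 24, 24, 21, 24, 25, 24, 24, 24, 22, 23, 22, 26, 23, 25, 24]
web_hours = [40, 10, 10, 34, 3, 40, 40, 15, 3, 8, 2, 3, 5, 15, 3, 30]
sexes = list("mmmfmmmfmfffmmmm")

mean_age = mean(ages)          # arithmetic mean of the ages
median_web = median(web_hours)  # median of weekly web use

print(f"n = {len(ages)}, men = {sexes.count('m')}, women = {sexes.count('f')}")
print(f"age range {min(ages)}-{max(ages)}, mean age {mean_age:.1f}")
print(f"median web use {median_web} h/week")
```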

3.2 Test Design

For the test, 16 participants (from now on called test users) who matched the criteria above were chosen. After filling out the consent form and the background questionnaire, each of them had to read 4 different texts. All four texts consist of about 500 words and have about the same (very low) level of difficulty. Furthermore, each text contains 10 substitution words placed randomly in the text. A substitution word is a word that replaces a very similar word in the text (e.g. a rhyming one) but has a totally different meaning. The text was set in Verdana at a fixed font size of 12pt and displayed with 90 characters per line (cpl).

Each user had to read one combination of texts and colour schemes from the table below:

 

Test conf  Text/colour scheme combinations
1          Text2  Text1  Text3  Text4
2          Text2  Text1  Text3  Text4
3          Text2  Text1  Text3  Text4
4          Text2  Text1  Text3  Text4
5          Text1  Text4  Text2  Text3
6          Text1  Text4  Text2  Text3
7          Text1  Text4  Text2  Text3
8          Text1  Text4  Text2  Text3
9          Text4  Text3  Text1  Text2
10         Text4  Text3  Text1  Text2
11         Text4  Text3  Text1  Text2
12         Text4  Text3  Text1  Text2
13         Text3  Text2  Text4  Text1
14         Text3  Text2  Text4  Text1
15         Text3  Text2  Text4  Text1
16         Text3  Text2  Text4  Text1

 

The assignment of test users to the text/colour combinations in the table above was randomized by letting each user pick one of the remaining numbers.
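
Read row by row, the four text orders in the table above form a Latin square: each text appears exactly once in each of the four columns. The following minimal Python sketch checks that property and simulates the drawing of configuration numbers. The function names and the seed are illustrative and not part of the original test setup, and treating the columns as fixed positions is an assumption, since the table does not label them.

```python
import random

# The four distinct text orders from the table; configurations 1-4, 5-8,
# 9-12 and 13-16 each share one order.
orders = [
    ["Text2", "Text1", "Text3", "Text4"],
    ["Text1", "Text4", "Text2", "Text3"],
    ["Text4", "Text3", "Text1", "Text2"],
    ["Text3", "Text2", "Text4", "Text1"],
]

def is_latin_square(rows):
    """True if every item occurs exactly once in each row and each column."""
    items = set(rows[0])
    if any(len(r) != len(items) for r in rows) or len(rows) != len(items):
        return False
    columns = list(zip(*rows))
    return all(set(r) == items for r in rows) and all(set(c) == items for c in columns)

def draw_configurations(n_configs=16, seed=None):
    """Simulate users drawing, one after another, one of the remaining numbers."""
    rng = random.Random(seed)
    remaining = list(range(1, n_configs + 1))
    drawn = []
    while remaining:
        # Each user picks one of the numbers that are still left.
        drawn.append(remaining.pop(rng.randrange(len(remaining))))
    return drawn  # drawn[i] is the configuration of the (i+1)-th user

print(is_latin_square(orders))
print(draw_configurations(seed=1))
```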

The colours used had the following values:

 
White   0xFFFFFF
Black   0x000000
Navy   0x000080
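
To give a rough idea of how strongly these colours contrast, the sketch below computes brightness difference and colour difference for the pairs used in the test. It uses the formulas from the W3C draft "Techniques for Accessibility Evaluation and Repair Tools", which suggests a brightness difference above 125 and a colour difference above 500. Whether exactly these formulas were applied in [2] is an assumption, and the function names are illustrative.

```python
def hex_to_rgb(value):
    """Split a 24-bit colour value such as 0x000080 into (r, g, b)."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

def brightness(rgb):
    """Perceived brightness on a 0..255 scale (W3C AERT weighting)."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) / 1000

def brightness_difference(c1, c2):
    return abs(brightness(hex_to_rgb(c1)) - brightness(hex_to_rgb(c2)))

def colour_difference(c1, c2):
    """Sum of the absolute per-channel differences (0..765)."""
    return sum(abs(a - b) for a, b in zip(hex_to_rgb(c1), hex_to_rgb(c2)))

WHITE, BLACK, NAVY = 0xFFFFFF, 0x000000, 0x000080

for name, fg, bg in [("black/white", BLACK, WHITE), ("navy/white", NAVY, WHITE)]:
    print(name, brightness_difference(fg, bg), colour_difference(fg, bg))
```

Both measures are symmetric, so swapping foreground and background (positive versus negative contrast) gives the same values.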

 

The test users were instructed to read the texts as quickly and accurately as possible. We told them that the texts contained substitution words, but not how many. Whenever users came upon a substitution word, they were to say it aloud.

We measured the time needed to read each text and the number of substitution words found. The time is our indicator of reading speed and the number of substitution words found is our indicator of reading accuracy. Together, these two figures indicate the reading efficiency.

After having read all 4 texts, each user had to fill out a post-test questionnaire containing mainly Likert scale questions.
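
The report does not define a formula for reading efficiency. One possible way to combine the two figures, shown purely as an illustration, is to weight the reading speed in words per minute by the fraction of substitution words found. The function name, the formula and the example values below are hypothetical.

```python
def reading_efficiency(words, seconds, found, total_substitutions=10):
    """Hypothetical efficiency score: words per minute times detection accuracy."""
    words_per_minute = words / seconds * 60
    accuracy = found / total_substitutions
    return words_per_minute * accuracy

# Illustrative values only: a 500-word text read in 200 s with 8 of 10 words found.
print(round(reading_efficiency(words=500, seconds=200, found=8), 1))
```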

3.3 Test Tasks

Text 1: Abschied

Text 2: Selbstmord

Text 3: Überrollt

Text 4: Wohin

3.4 Test Environment

The test was carried out at the following address:

Sandgasse 25c/9, 8010 Graz

The test users were working with the following equipment:

Computer: Pentium 4 CPU, 2.6GHz and 512MB RAM
Monitor: Sony TFT with a resolution of 1280 x 1024 and 32-bit color depth
Application: Internet Explorer 6.0

The following equipment was used to gather the necessary data:

DigiCam, VHS recorder, a mirror placed so that the face of the user and the screen could be seen simultaneously, microphone, and notes taken on paper.

 

4. Results

4.1 Objective Measures

The following table gives the results of the objective measures. The numbers are averages calculated over all test users. Note that the word counts refer to the number of missed substitution words (out of 10 per text). Time is the time needed to read the whole text.

 

                #missed words  std dev words  time [sec]  std dev time
White on Black  2.75           2.38           195         53.1
Black on White  1.94           1.12           194         43.3
Blue on White   1.5            1.32           200         42.3
White on Blue   2.06           1.61           198         46.3

 

A statistical analysis showed that the variation in the values was too large for the differences to be statistically significant at the 95% confidence level.
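
The report does not state which statistical test was used. As a rough plausibility check, the largest difference in missed words ("White on Black" versus "Blue on White") can be tested from the summary figures alone with Welch's t-test. This is only a sketch under assumptions: it treats the two conditions as independent samples of 16 users each, although every user read all four colour schemes and a paired test on the raw data would be the more appropriate choice. The critical value is an approximate table value.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) of Welch's unequal-variance t-test from summary statistics."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Missed words: "White on Black" vs "Blue on White", n = 16 each.
t, df = welch_t(2.75, 2.38, 16, 1.5, 1.32, 16)

# Approximate two-sided 5% critical value of Student's t for about 23 degrees of freedom.
T_CRIT = 2.07

print(f"t = {t:.2f}, df = {df:.1f}, significant at 5%: {abs(t) > T_CRIT}")
```

Under these assumptions the t value stays below the critical value, which is in line with the non-significant result reported above.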

4.2 User Satisfaction

The following table gives the results of the feedback questionnaire given to the test users right after the test. The users were asked to give ratings from 0 to 6 on a Likert scale.

 

                Readability  Concentration  Aesthetics
White on Black  3.13         3.19           3.13
Black on White  3.56         3.69           4
Blue on White   3.25         3.38           3.81
White on Blue   3.44         2.94           3.75
(0...worst, 6...best)

 

                std dev Readability  std dev Concentration  std dev Aesthetics
White on Black  2.58                 1.83                   2.22
Black on White  2.07                 1.85                   1.71
Blue on White   1.69                 1.86                   1.72
White on Blue   1.79                 1.57                   1.92
 

A statistical analysis showed that the variation in the values was too large for the differences to be statistically significant at the 95% confidence level.

 

Text        Text difficulty
Abschied    4
Selbstmord  3.81
Überrollt   2.56
Wohin       3.56
(0...hard, 6...easy)
 

The influence of the color combination on readability was rated 5.4 on the scale from 0 to 6 (0...no influence, 6...strong influence).

7 (43.8%) users preferred "Black on White", 4 (25%) users "Blue on White", 3 (18.8%) users "White on Black" and 2 (12.5%) users "White on Blue".
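
The preference shares follow directly from the vote counts out of 16 users; for example, 2 of 16 users corresponds to 12.5%. A minimal Python sketch of the arithmetic, with illustrative variable names:

```python
# Number of users who preferred each colour scheme.
votes = {
    "Black on White": 7,
    "Blue on White": 4,
    "White on Black": 3,
    "White on Blue": 2,
}
total = sum(votes.values())

# Share of each scheme in percent.
shares = {scheme: 100 * n / total for scheme, n in votes.items()}

# Positive contrast: dark text on a light background; negative contrast: the reverse.
positive = shares["Black on White"] + shares["Blue on White"]
negative = shares["White on Black"] + shares["White on Blue"]

for scheme, share in shares.items():
    print(f"{scheme}: {share:.2f}%")
print(f"positive contrast: {positive:.2f}%, negative contrast: {negative:.2f}%")
```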

5. Discussion

First of all, please note that the results proved not to be statistically significant. Any interpretation of the results can therefore only indicate a tendency.

In our experiment the most substitution words were found with "Blue on White" and the fewest with "White on Black", although the differences between the combinations are very small.

The "Black on White" combination was read fastest and "Blue on White" slowest. It is quite astonishing that the combination with the best reading accuracy, namely "Blue on White", at the same time seems to slow down reading.

The subjective opinions of the users are very clear. "Black on White" is the winner in all 3 categories and, with 43.8%, the most preferred combination. The losers are "White on Black" and "White on Blue", i.e. the combinations with negative contrast. With 2.94 (the only average rating less than 3), users think that "White on Blue" is bad for concentration. Aesthetically, "White on Black" is by far the worst with 3.13. The worst readability was also attributed to "White on Black". Users don't like the schemes with a darker background and lighter text color. While 68.8% preferred positive contrast, only 31.2% preferred negative contrast.

Comparing objective and subjective measures, we can state that the reading accuracy and the rating are very closely related, i.e. with negative contrast the users found fewer words. The reading speed doesn't reflect the user rating at all.

Although the influence of color combinations on readability was rated very high by the users themselves (5.4 out of 6), the measurements do not show enough difference to reflect this.

6. References

[1] Richard H. Hall: The Effect Of Web Page Text-Background Color Combinations On Retention and Perceived Readability, Aesthetics And Behavioral Intention.

[2] Ridpath/Treviranus/Weiss: Testing The Readability Of Web Page Colors

[3] Alyson Hill/Lauren Scharff: Readability Of Websites With Various Foreground/Background Color Combinations, Font Types And Word Styles.

7. Appendix

7.1 Orientation Script

Orientation Script

7.2 Background Questionnaire

Background Questionnaire

7.3 Consent Form

Consent Form

7.4 Feedback Questionnaire

Feedback Questionnaire