The output of “fio” can be overwhelming because this tool does a lot for you. Your job is to feed “fio” the right options and then interpret the output. This post will help you understand the output in detail. I know it’s difficult to read, but I am limited a little by the WordPress design here and may improve it in the future.

The official documentation

The HOWTO provides some insights into the output of “fio”. I have copied some parts of the HOWTO here, and I add more detail to some parts and summarize others.

Output while running

fio spits out a lot of output. While running, fio will display the
status of the jobs created. An example of that would be:

    Threads: 1: [_r] [24.8% done] [13509/8334 kb/s] [eta 00h:01m:31s]

The characters inside the square brackets denote the current status of
each thread. The possible values (in typical life cycle order) are:

Idle  Run   Description
P           Thread setup, but not started.
C           Thread created.
I           Thread initialized, waiting or generating necessary data.
      p     Thread running pre-reading file(s).
      R     Running, doing sequential reads.
      r     Running, doing random reads.
      W     Running, doing sequential writes.
      w     Running, doing random writes.
      M     Running, doing mixed sequential reads/writes.
      m     Running, doing mixed random reads/writes.
      F     Running, currently waiting for fsync()
      f     Running, finishing up (writing IO logs, etc)
      V     Running, doing verification of written data.
E           Thread exited, not reaped by main thread yet.
_           Thread reaped, or
X           Thread reaped, exited with an error.
K           Thread reaped, exited due to signal.
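To make the status field easier to read, here is a minimal Python sketch that decodes it. It assumes the older progress-line format quoted from the HOWTO above, where the first pair of square brackets holds one character per thread; newer fio versions may print this line differently. The name `decode_status_line` is my own and not part of fio.

```python
import re

# Status characters as listed in the HOWTO table above.
STATUS = {
    "P": "Thread setup, but not started",
    "C": "Thread created",
    "I": "Thread initialized, waiting or generating necessary data",
    "p": "Thread running pre-reading file(s)",
    "R": "Running, doing sequential reads",
    "r": "Running, doing random reads",
    "W": "Running, doing sequential writes",
    "w": "Running, doing random writes",
    "M": "Running, doing mixed sequential reads/writes",
    "m": "Running, doing mixed random reads/writes",
    "F": "Running, currently waiting for fsync()",
    "f": "Running, finishing up (writing IO logs, etc)",
    "V": "Running, doing verification of written data",
    "E": "Thread exited, not reaped by main thread yet",
    "_": "Thread reaped",
    "X": "Thread reaped, exited with an error",
    "K": "Thread reaped, exited due to signal",
}


def decode_status_line(line):
    """Return one description per thread from a fio progress line.

    Hypothetical helper: assumes the first [...] group is the status field.
    """
    match = re.search(r"\[([^\]]*)\]", line)
    if match is None:
        raise ValueError("no status field found in line")
    return [STATUS.get(ch, "unknown status") for ch in match.group(1)]


line = "Threads: 1: [_r] [24.8% done] [13509/8334 kb/s] [eta 00h:01m:31s]"
for description in decode_status_line(line):
    print(description)
```

For the example line, the field `_r` decodes to one thread that has already been reaped and one that is doing random reads.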

Job overview output

This gives you an overview of the jobs and the options used. It’s useful to check this heading if you receive only the results of a run but not the command line call or job file.

    job-1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
    job-2: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
    fio-2.2.9-26-g669e
    Starting 2 processes
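If you only have the results of a run, the options in the job overview lines can be recovered with a few lines of Python. This is a rough sketch that assumes the `name: (g=N): key=value, ...` layout shown above; `parse_job_header` is a made-up helper name, not something fio provides.

```python
def parse_job_header(line):
    """Split a job overview line into name, group id and an options dict.

    Assumes the layout 'name: (g=N): key=value, key=value, ...'.
    """
    name, group, options = line.split(":", 2)
    group_id = int(group.strip().strip("()").split("=")[1])
    parsed = {}
    for item in options.split(","):
        key, value = item.strip().split("=", 1)
        parsed[key] = value
    return {"name": name.strip(), "group": group_id, "options": parsed}


header = "job-1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1"
job = parse_job_header(header)
print(job["name"], job["group"])
for key, value in job["options"].items():
    print(f"  {key} = {value}")
```

As far as I know, the three `bs` ranges in this fio version stand for the read, write and trim block sizes, so the sketch keeps the value as one string instead of splitting it further.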

Data direction output

All details for each data direction are shown here. The most important numbers are:

io=

bw=

iops=

issued =

lat =

The details are explained in the box below.

When fio is done (or interrupted by ctrl-c), it will show the data for
each thread, group of threads, and disks in that order. For each data
direction, the output looks like:

    Client1 (g=0): err=0:
      write: io=32MB, bw=666KB/s, iops=89, runt=50320msec
        slat (msec): min=0, max=136, avg=0.03, stdev=1.92
        clat (msec): min=0, max=631, avg=48.50, stdev=86.82
        bw (KB/s): min=0, max=1196, per=51.00%, avg=664.02, stdev=681.68
      cpu: usr=1.49%, sys=0.25%, ctx=7969, majf=0, minf=17
      IO depths: 1=0.1%, 2=0.3%, 4=0.5%, 8=99.0%, 16=0.0%, 32=0.0%, >32=0.0%
        submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
        complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
        issued r/w: total=0/32768, short=0/0
        lat (msec): 2=1.6%, 4=0.0%, 10=3.2%, 20=12.8%, 50=38.4%, 100=24.8%,
        lat (msec): 250=15.2%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2048=0.0%

io=           Number of megabytes io performed
bw=           Average bandwidth rate
iops=         Average IOs performed per second
runt=         The runtime of that thread
slat=         Submission latency (avg being the average, stdev being the
              standard deviation). This is the time it took to submit
              the io. For sync io, the slat is really the completion
              latency, since queue/complete is one operation there. This
              value can be in milliseconds or microseconds, fio will choose
              the most appropriate base and print that. In the example
              above, milliseconds is the best scale. Note: in --minimal mode
              latencies are always expressed in microseconds.
clat=         Completion latency. Same names as slat, this denotes the
              time from submission to completion of the io pieces. For
              sync io, clat will usually be equal (or very close) to 0,
              as the time from submit to complete is basically just
              CPU time (io has already been done, see slat explanation).
bw=           Bandwidth. Same names as the xlat stats, but also includes
              an approximate percentage of total aggregate bandwidth
              this thread received in this group. This last value is
              only really useful if the threads in this group are on the
              same disk, since they are then competing for disk access.
cpu=          CPU usage. User and system time, along with the number
              of context switches this thread went through, usage of
              system and user time, and finally the number of major
              and minor page faults.
IO depths=    The distribution of io depths over the job life time. The
              numbers are divided into powers of 2, so for example the
              16= entries includes depths up to that value but higher
              than the previous entry. In other words, it covers the
              range from 16 to 31.
IO submit=    How many pieces of IO were submitting in a single submit
              call. Each entry denotes that amount and below, until
              the previous entry - eg, 8=100% mean that we submitted
              anywhere in between 5-8 ios per submit call.
IO complete=  Like the above submit number, but for completions instead.
IO issued=    The number of read/write requests issued, and how many
              of them were short.
IO latencies= The distribution of IO completion latencies. This is the
              time from when IO leaves fio and when it gets completed.
              The numbers follow the same pattern as the IO depths,
              meaning that 2=1.6% means that 1.6% of the IO completed
              within 2 msecs, 20=12.8% means that 12.8% of the IO
              took more than 10 msecs, but less than (or equal to) 20 msecs.
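The latency buckets are easier to judge when you add them up. The following Python sketch parses the two `lat (msec)` lines from the HOWTO sample above and prints the cumulative share of IO that completed within each bucket limit. The helper names `parse_buckets` and `cumulative` are mine, and the input string is simply the two sample lines joined together.

```python
def parse_buckets(text):
    """Parse 'label=percent%' pairs into an ordered list of (label, percent)."""
    buckets = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # skip the empty piece left by a trailing comma
        # rsplit keeps labels such as '>=2048' intact
        label, percent = item.rsplit("=", 1)
        buckets.append((label, float(percent.rstrip("%"))))
    return buckets


def cumulative(buckets):
    """Return (label, running total in percent) for each bucket."""
    total = 0.0
    result = []
    for label, percent in buckets:
        total += percent
        result.append((label, round(total, 1)))
    return result


lat_msec = (
    "2=1.6%, 4=0.0%, 10=3.2%, 20=12.8%, 50=38.4%, 100=24.8%,"
    " 250=15.2%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2048=0.0%"
)

for label, share in cumulative(parse_buckets(lat_msec)):
    print(f"<= {label} msec: {share}% of the IO")
```

Read this way, 17.6% of the IO in the sample completed within 20 msec and 80.8% within 100 msec. The same parser works for the `IO depths`, `submit` and `complete` lines, because they use the same `label=percent%` layout.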

Group statistics

After each client has been listed, the group statistics are printed. They

“clat percentile” gives a detailed breakdown of what percentage of the IO completed within which time frame. In this case, 99% of the IO completed in <= 1192 usec = 1.2 msec. This value is often used to ignore the few spikes when testing. The maximum clat was 13505, which is ~39x longer than the average of 344.
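To show why a percentile hides the spikes while the maximum does not, here is a small Python illustration. The latency samples are invented for this example (99 fast IOs plus one spike as large as the maximum above), and the nearest-rank method used here is only one simple way to compute a percentile; I am not claiming it is what fio does internally.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical samples in usec: 99 fast IOs and a single slow outlier.
latencies = [300] * 99 + [13505]

print("99th percentile:", percentile(latencies, 99))
print("maximum:        ", max(latencies))
print("average:        ", sum(latencies) / len(latencies))
```

The single outlier sets the maximum and pulls the average up, but the 99th percentile stays at the value that almost all IOs actually saw.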

“bw” min, max, per, avg, stdev

In this case the bandwidth was 345090 KB/s = ~337 MB/s.
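The conversion is plain arithmetic. As a quick check in Python, assuming that 1 MB here means 1024 KB (which is what the numbers above imply):

```python
def kb_per_s_to_mb_per_s(kb_per_s):
    """Convert KB/s to MB/s, assuming 1 MB = 1024 KB."""
    return kb_per_s / 1024


bandwidth_mb = kb_per_s_to_mb_per_s(345090)
print(f"{bandwidth_mb:.0f} MB/s")
```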

“lat” is like the clat part.

In this case 91.01% of the IO took more than 250 usec and at most 500 usec to complete. This is in line with the average latency of 360.23 usec. Only ~8.7% of the IO took more than 750 usec and at most 2 msec. Together, these two ranges account for nearly 99.8% of all IO.

“cpu”

This line is dedicated to the CPU usage of the running job.

“usr=5.66%”

This is the percentage of CPU usage of the running job at user level.

100% would mean that one CPU core is fully loaded; what counts as a core depends on whether Hyper-Threading (HT) is on or off.

“sys=12.09%”

This is the percentage of CPU usage of the running job at system/kernel level.