Adding CPUInfo to Sysinfo

There is a lot of interesting information about the processor hardware in /proc/cpuinfo. Here is a little bit from one of my NUC servers:
processor	: 0
vendor_id : GenuineIntel
cpu family : 6
model : 69
model name : Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz
stepping : 1
microcode : 0x16
cpu MHz : 779.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips : 3791.14
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

The output of "cat /proc/cpuinfo" is actually four copies of this, with small variations in core id (ranging between 0 and 1), the processor (ranging between 0 and 3), and the apicid (ranging from 0 to 3).

In order to add this information to my sysinfo.py I wrote a new module, cpuinfo.py, modeled on the df.py module that I used to add filesystem information.
""" Parse the content of /proc/cpuinfo and create JSON objects for each cpu

Written by Marc Donner
$Id: cpuinfo.py,v 1.7 2014/11/06 18:25:30 marc Exp marc $

"""

import subprocess
import json
import re

def main():
    """Main routine"""
    print(CPUInfo().to_json())

# Utility routine ...
#
# The /proc/cpuinfo content is a set of (attribute, value) records;
# the separator between attribute and value is "\t+: "
#
# When there are multiple CPUs, there's a blank line between sets
# of lines.
#

class CPUInfo(object):
    """ An object with key data from the content of the /proc/cpuinfo file """

    def __init__(self):
        self.cpus = {}
        self.populated = False

    def to_json(self):
        """ Display the object as a JSON string (prettyprinted) """
        if not self.populated:
            self.populate()
        return json.dumps(self.cpus, sort_keys=True, indent=2)

    def get_array(self):
        """ return the array of cpus """
        if not self.populated:
            self.populate()
        return self.cpus["processors"]

    def populate(self):
        """ get the content of /proc/cpuinfo and populate the arrays """
        self.cpus["processors"] = []
        cpu = {}
        cpu["processor"] = {}
        # universal_newlines=True returns text rather than bytes
        text = subprocess.check_output(["cat", "/proc/cpuinfo"],
                                       universal_newlines=True).rstrip()
        lines = text.split('\n')
        # Use re.split because there's a varying number of tabs :-(
        array = [re.split('\t+: ', x) for x in lines]
        # cpuinfo is structured as n blocks of data, one per logical processor
        # o each block has the processor id (0, 1, ...) as its first row.
        # o each block ends with a blank row
        # o some of the rows have attributes but no values
        #   (e.g. power management)
        for row in range(len(array)):
            # A blank row ends a processor block: attach this processor
            # to the output, then create a new one
            if len(lines[row]) == 0:
                self.cpus["processors"].append(cpu)
                cpu = {}
                cpu["processor"] = {}
            if len(array[row]) == 2:
                (attribute, value) = array[row]
                attribute = attribute.replace(" ", "_")
                cpu["processor"][attribute] = value
        # The trailing blank row was stripped, so attach the last processor
        self.cpus["processors"].append(cpu)
        self.populated = True

if __name__ == '__main__':
    main()

The state machine implicit in the main loop of populate() is plausibly efficient, though there remains something about it that annoys me. I need to think about edge cases and failure modes to see whether I can make it better.
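One possible alternative, sketched here under my own assumptions rather than taken from cpuinfo.py, is to split the text into blocks on blank lines first and then parse each block independently, so that no "current processor" state has to be carried across rows. The function name parse_cpuinfo and the SAMPLE text are hypothetical. Unlike populate(), this sketch keeps attributes that have no value (such as "power management") with an empty string.

```python
import re


def parse_cpuinfo(text):
    """Return a list of {"processor": {...}} dicts, one per block of text."""
    processors = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r'\n\s*\n', text.strip()):
        attributes = {}
        for line in block.split('\n'):
            # partition() splits at the first colon, so any number of tabs
            # before it is tolerated; rows with no value yield "".
            name, colon, value = line.partition(':')
            if not colon:
                continue
            attributes[name.strip().replace(' ', '_')] = value.strip()
        processors.append({"processor": attributes})
    return processors


# Hypothetical two-processor excerpt in the /proc/cpuinfo layout.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz\n"
    "power management:\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz\n"
    "power management:\n"
)

cpus = parse_cpuinfo(SAMPLE)
print(len(cpus))
```

Because the function takes a string, it can be exercised with canned text on a machine that has no /proc/cpuinfo at all, which makes the edge cases easier to probe.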

The result is an augmented JSON object including info on the logical processors:
cat crepe.sysinfo 
{
"boot_time": "system boot 2014-09-14 16:03",
"bufferram": 193994752,
"distro_codename": "trusty",
"distro_description": "Ubuntu 14.04.1 LTS",
"distro_distributor": "Ubuntu",
"distro_release": "14.04",
"filesystems": [
{
"filesystem": {
"mount_point": "/",
"name": "/dev/sda1",
"size": "444919888",
"used": "3038660"
}
},
{
"filesystem": {
"mount_point": "/sys/fs/cgroup",
"name": "none",
"size": "4",
"used": "0"
}
},
{
"filesystem": {
"mount_point": "/dev",
"name": "udev",
"size": "8169708",
"used": "4"
}
},
{
"filesystem": {
"mount_point": "/run",
"name": "tmpfs",
"size": "1636112",
"used": "564"
}
},
{
"filesystem": {
"mount_point": "/run/lock",
"name": "none",
"size": "5120",
"used": "0"
}
},
{
"filesystem": {
"mount_point": "/run/shm",
"name": "none",
"size": "8180548",
"used": "4"
}
},
{
"filesystem": {
"mount_point": "/run/user",
"name": "none",
"size": "102400",
"used": "0"
}
}
],
"freeram": 12954943488,
"freeswap": 17103319040,
"hardware_platform": "x86_64",
"kernel_name": "Linux",
"kernel_release": "3.13.0-35-generic",
"kernel_version": "#62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014",
"machine": "x86_64",
"mem_unit": 1,
"nodename": "crepe",
"operating_system": "GNU/Linux",
"processor": "x86_64",
"processors": [
{
"processor": {
"address_sizes": "39 bits physical, 48 bits virtual",
"apicid": "0",
"bogomips": "3791.14",
"cache_alignment": "64",
"cache_size": "3072 KB",
"clflush_size": "64",
"core_id": "0",
"cpu_MHz": "779.000",
"cpu_cores": "2",
"cpu_family": "6",
"cpuid_level": "13",
"flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid",
"fpu": "yes",
"fpu_exception": "yes",
"initial_apicid": "0",
"microcode": "0x16",
"model": "69",
"model_name": "Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz",
"physical_id": "0",
"processor": "0",
"siblings": "4",
"stepping": "1",
"vendor_id": "GenuineIntel",
"wp": "yes"
}
},
{
"processor": {
"address_sizes": "39 bits physical, 48 bits virtual",
"apicid": "2",
"bogomips": "3791.14",
"cache_alignment": "64",
"cache_size": "3072 KB",
"clflush_size": "64",
"core_id": "1",
"cpu_MHz": "779.000",
"cpu_cores": "2",
"cpu_family": "6",
"cpuid_level": "13",
"flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid",
"fpu": "yes",
"fpu_exception": "yes",
"initial_apicid": "2",
"microcode": "0x16",
"model": "69",
"model_name": "Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz",
"physical_id": "0",
"processor": "1",
"siblings": "4",
"stepping": "1",
"vendor_id": "GenuineIntel",
"wp": "yes"
}
},
{
"processor": {
"address_sizes": "39 bits physical, 48 bits virtual",
"apicid": "1",
"bogomips": "3791.14",
"cache_alignment": "64",
"cache_size": "3072 KB",
"clflush_size": "64",
"core_id": "0",
"cpu_MHz": "779.000",
"cpu_cores": "2",
"cpu_family": "6",
"cpuid_level": "13",
"flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid",
"fpu": "yes",
"fpu_exception": "yes",
"initial_apicid": "1",
"microcode": "0x16",
"model": "69",
"model_name": "Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz",
"physical_id": "0",
"processor": "2",
"siblings": "4",
"stepping": "1",
"vendor_id": "GenuineIntel",
"wp": "yes"
}
},
{
"processor": {
"address_sizes": "39 bits physical, 48 bits virtual",
"apicid": "3",
"bogomips": "3791.14",
"cache_alignment": "64",
"cache_size": "3072 KB",
"clflush_size": "64",
"core_id": "1",
"cpu_MHz": "1000.000",
"cpu_cores": "2",
"cpu_family": "6",
"cpuid_level": "13",
"flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid",
"fpu": "yes",
"fpu_exception": "yes",
"initial_apicid": "3",
"microcode": "0x16",
"model": "69",
"model_name": "Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz",
"physical_id": "0",
"processor": "3",
"siblings": "4",
"stepping": "1",
"vendor_id": "GenuineIntel",
"wp": "yes"
}
}
],
"report_date": "2014-11-06 13:27:06",
"sharedram": 0,
"totalhigh": 0,
"totalram": 16753766400,
"totalswap": 17103319040,
"uptime": 4573401
}

I am tempted to augment the module with a configuration capability that would let me set sysinfo up to restrict the set of data from /proc/cpuinfo that I actually include in the sysinfo structure. Do I need "fpu" and "fpu_exception" or "clflush_size" for the things that I will be using the sysinfo stuff for? I'm skeptical. If I make it a configurable filter I can always incorporate data elements after I decide they're interesting.
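A minimal sketch of what such a filter might look like, assuming the processor records have the same shape as the JSON above. The names filter_attributes and WANTED are hypothetical, and the particular set of attributes is only an illustration, not a decision.

```python
def filter_attributes(processors, wanted):
    """Keep only the attributes named in wanted for each processor record."""
    filtered = []
    for cpu in processors:
        kept = {k: v for k, v in cpu["processor"].items() if k in wanted}
        filtered.append({"processor": kept})
    return filtered


# Hypothetical configuration: the attributes worth reporting.
WANTED = {"processor", "core_id", "cpu_cores", "model_name"}

# One record in the shape that cpuinfo.py produces, trimmed for brevity.
sample = [{"processor": {
    "processor": "0",
    "core_id": "0",
    "cpu_cores": "2",
    "model_name": "Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz",
    "fpu": "yes",
    "fpu_exception": "yes",
    "clflush_size": "64",
}}]

result = filter_attributes(sample, WANTED)
print(sorted(result[0]["processor"]))
```

Keeping the wanted set as data rather than code means an attribute can be added later without touching the parser.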

Decisions, decisions.

Moreover, the repetition of the CPU information is annoying. The four attributes that vary are processor, core id, apicid, and initial apicid. The values are structured thus (initial apicid seems never to vary from apicid):







processor  core id  apicid
0          0        0
1          1        2
2          0        1
3          1        3


It would be much more sensible to reduce the size and complexity of the processors section by consolidating the common parts and displaying the variant sections in some sensible subsidiary fashion.
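One way this consolidation could work, offered as a sketch under my own assumptions rather than a settled design: treat an attribute as common when every processor reports the same value for it, and keep everything else in a per-processor list. The function name consolidate and the "common" / "variants" keys are hypothetical.

```python
def consolidate(processors):
    """Split attributes into those shared by all processors and the rest."""
    records = [cpu["processor"] for cpu in processors]
    if not records:
        return {"common": {}, "variants": []}
    first = records[0]
    # An attribute is common when every record has it with the same value.
    common = {k: v for k, v in first.items()
              if all(r.get(k) == v for r in records)}
    # Whatever is not common stays with its own processor.
    variants = [{k: v for k, v in r.items() if k not in common}
                for r in records]
    return {"common": common, "variants": variants}


# The varying attributes from the table above, plus one shared attribute.
sample = [
    {"processor": {"processor": "0", "core_id": "0", "apicid": "0", "cpu_cores": "2"}},
    {"processor": {"processor": "1", "core_id": "1", "apicid": "2", "cpu_cores": "2"}},
    {"processor": {"processor": "2", "core_id": "0", "apicid": "1", "cpu_cores": "2"}},
    {"processor": {"processor": "3", "core_id": "1", "apicid": "3", "cpu_cores": "2"}},
]

result = consolidate(sample)
print(result["common"])
```

With the real data, cpu_MHz would also end up among the variants, since processor 3 reported 1000.000 while the others reported 779.000 in the output shown earlier.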

These items are discussed in this Intel web page.
