July 17, 2019 by Jean-Luc Aufranc (CNXSoft) - 39 Comments

Checking Out Machine Check Exception (MCE) Errors in Linux

I recently reviewed ODROID-H2 with Ubuntu 19.04, and noticed some error messages in the kernel log of the Intel Celeron J4105 single board computer while running the SBC-Bench benchmark:

[180422.405294] mce: [Hardware Error]: Machine check events logged
[180425.656449] mce: [Hardware Error]: Machine check events logged
[180483.582825] mce_notify_irq: 17 callbacks suppressed
[180483.582827] mce: [Hardware Error]: Machine check events logged
[180484.991484] mce: [Hardware Error]: Machine check events logged
[180594.700684] mce_notify_irq: 13 callbacks suppressed
[180594.700686] mce: [Hardware Error]: Machine check events logged
[180858.202115] mce: [Hardware Error]: Machine check events logged
[181178.047031] mce: [Hardware Error]: Machine check events logged
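
To get a quick count of how often this happens, the kernel log text can be scanned for the two kinds of lines shown above. The following is a minimal Python sketch and was not part of the original investigation; the function name `summarize_mce_lines` is made up for illustration, and it only recognizes the exact message formats seen in this log.

```python
import re

# Patterns for the two kinds of lines seen in the kernel log excerpt above.
LOGGED = re.compile(r"mce: \[Hardware Error\]: Machine check events logged")
SUPPRESSED = re.compile(r"mce_notify_irq: (\d+) callbacks suppressed")


def summarize_mce_lines(kernel_log: str) -> dict:
    """Count MCE notifications and rate-limited callbacks in kernel log text."""
    logged = 0
    suppressed = 0
    for line in kernel_log.splitlines():
        if LOGGED.search(line):
            logged += 1
        match = SUPPRESSED.search(line)
        if match:
            # The kernel reports how many notifications it skipped.
            suppressed += int(match.group(1))
    return {"logged": logged, "suppressed": suppressed}
```

Feed it the text produced by dmesg, for example after saving it to a file. For the excerpt above it would count 7 logged notifications and 30 suppressed callbacks.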

I did not know what to make of those errors, but I was told I would get more details with mcelog, which can be installed as follows:

sudo apt install mcelog

There’s just one little problem: it’s not in the Ubuntu 19.04 repository, and a bug report mentions that mcelog is now deprecated and has been removed from Ubuntu 18.04 Bionic onwards. Instead, we’re told that the functionality of the mcelog package has been replaced by rasdaemon.

But before looking into the utilities, let’s find out what a Machine Check Exception (MCE) is all about from the Arch Linux Wiki:

A machine check exception (MCE) is an error generated by the CPU when the CPU detects that a hardware error or failure has occurred.

Machine check exceptions (MCEs) can occur for a variety of reasons ranging from undesired or out-of-spec voltages from the power supply, from cosmic radiation flipping bits in memory DIMMs or the CPU, or from other miscellaneous faults, including faulty software triggering hardware errors.

Hardware errors should probably be taken seriously, so let’s investigate how to run the tools. First, I tried to install mcelog from Ubuntu 16.04:

wget http://archive.ubuntu.com/ubuntu/pool/universe/m/mcelog/mcelog_128+dfsg-1_amd64.deb
sudo dpkg -i mcelog_128+dfsg-1_amd64.deb

Oh good! It installed… Let’s run some commands:

sudo mcelog
[sudo] password for odroid: 
mcelog: Family 6 Model 7a CPU: only decoding architectural errors
mcelog: warning: 32 bytes ignored in each record
mcelog: consider an update
odroid@ODROID-H2:~$ sudo mcelog --client
Memory errors
SOCKET 1 CHANNEL 5 DIMM 0
DMI_NAME "A1_DIMM0" DMI_LOCATION "A1_BANK0"
corrected memory errors:
	0 total
	0 in 24h
uncorrected memory errors:
	0 total
	0 in 24h

SOCKET 1 CHANNEL 5 DIMM 1
DMI_NAME "A1_DIMM1" DMI_LOCATION "A1_BANK1"
corrected memory errors:
	0 total
	0 in 24h
uncorrected memory errors:
	0 total
	0 in 24h

Nothing interesting shows up here, but the file /var/log/mcelog is now up, and we can see details about the errors:

cat  /var/log/mcelog 
mcelog: Family 6 Model 7a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 1 TSC bd2ee6710 
TIME 1563095601 Sun Jul 14 16:13:21 2019
MCG status:
MCi status:
Corrected error
Error enabled
Threshold based error status: green
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS 902000460082110a MCGSTATUS 0
MCGCAP c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 122
...

But let’s also try the recommended rasdaemon to see if we can get similar details.

Installation:

sudo apt install rasdaemon

It looks like the service will not start automatically upon installation, so a reboot may be needed, or simply run the following command:

service rasdaemon start

I ran a few commands and at first, it looked like some driver may be needed:

ras-mc-ctl --mainboard
ras-mc-ctl: mainboard: HARDKERNEL model ODROID-H2
sudo ras-mc-ctl --status
ras-mc-ctl: drivers not loaded.

According to a thread on Grokbase, this should be related to the EDAC drivers that are used for ECC memory. Gemini Lake processors do not support ECC memory, so I probably don’t need them.

Running one more command to show the summary of errors gets us somewhere:

sudo ras-mc-ctl --summary
No Memory errors.

No PCIe AER errors.

No Extlog errors.
MCE records summary:
	12 corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error errors

That’s 12 corrected errors related to the L2 cache. We can get the full details with the appropriate command:

sudo ras-mc-ctl --errors
No Memory errors.

No PCIe AER errors.

No Extlog errors.

MCE events:
1 2019-07-15 20:41:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x942000460082110a, addr=0x243e9f840, tsc=0x8b99a7f84108, walltime=0x5d2c8276, cpuid=0x000706a1, bank=0x00000001
2 2019-07-16 01:34:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x942000460082110a, addr=0x24b9df840, tsc=0xa38afb430944, walltime=0x5d2cc722, cpuid=0x000706a1, bank=0x00000001
3 2019-07-16 01:50:08 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000420082110a, tsc=0xa4d95741ee28, walltime=0x5d2ccae1, cpuid=0x000706a1, bank=0x00000001
4 2019-07-16 01:50:08 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000420082110a, tsc=0xa4d957436320, walltime=0x5d2ccae1, cpuid=0x000706a1, bank=0x00000001
5 2019-07-16 01:50:08 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000420082110a, tsc=0xa4d957451d82, walltime=0x5d2ccae1, cpuid=0x000706a1, bank=0x00000001
6 2019-07-16 01:50:08 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000420082110a, tsc=0xa4d957456482, walltime=0x5d2ccae1, cpuid=0x000706a1, bank=0x00000001
7 2019-07-16 03:20:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000400082110a, tsc=0xac3468f91976, walltime=0x5d2cdffa, cpuid=0x000706a1, bank=0x00000001
8 2019-07-16 03:20:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000400082110a, tsc=0xac3468fb7a3a, walltime=0x5d2cdffa, cpuid=0x000706a1, bank=0x00000001
9 2019-07-16 15:08:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000460082110a, tsc=0xe60f3181c782, walltime=0x5d2d85ea, cpuid=0x000706a1, bank=0x00000001
10 2019-07-16 15:08:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000460082110a, tsc=0xe60f31852002, walltime=0x5d2d85ea, cpuid=0x000706a1, bank=0x00000001
11 2019-07-17 02:52:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x942000460082110a, addr=0x249c5f840, tsc=0x11f964ae442b2, walltime=0x5d2e2aea, cpuid=0x000706a1, bank=0x00000001
12 2019-07-17 15:24:09 +0700 error: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error, mcg mcgstatus=0, mci Corrected_error Error_enabled Threshold based error status: green, Large number of corrected cache errors. System operating, but might leadto uncorrected errors soon, mcgcap=0x00000c07, status=0x902000440082110a, tsc=0x15d0984e5de54, walltime=0x5d2edb2a, cpuid=0x000706a1, bank=0x00000001

The status is green, which means everything still works, but the utility reports a “large number of corrected cache errors” and warns that the “system (is) operating, but might lead to uncorrected errors soon” (see source code). The errors happen only a few times a day, and I’m not sure what can be done about them, since the cache is embedded in the processor and cannot be replaced. Maybe it’s just an issue with the processor I’m running. If you have an ODROID-H2, it may be useful to check the kernel log with dmesg to see if you get the same errors. If you do, please also indicate whether you have a board from the first batch (November 2018) or one of the new ODROID-H2 Rev B boards.
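
The flags printed for each event come from the status value itself. As an illustration, here is a small Python sketch that decodes a few architectural fields of that value. The bit positions reflect my reading of the machine-check chapter of Intel's Software Developer's Manual and should be treated as assumptions to double-check; `decode_status` is a made-up name, and this is not the code that mcelog or rasdaemon actually run.

```python
# Architectural flag bits of the IA32_MCi_STATUS register.
FLAG_BITS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # error overflow
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}

THRESHOLD_STATUS = {0: "no tracking", 1: "green", 2: "yellow", 3: "reserved"}


def decode_status(status: int, mcgcap: int) -> dict:
    """Decode a few architectural fields of an MCi_STATUS value."""
    flags = [name for bit, name in FLAG_BITS.items() if (status >> bit) & 1]
    corrected = "UC" not in flags
    decoded = {
        "flags": flags,
        "corrected": corrected,
        "mca_error_code": status & 0xFFFF,  # bits 15:0
    }
    # Bits 54:53 hold the threshold status for corrected errors,
    # but only when MCG_CAP bit 11 (TES_P) is set.
    if corrected and (mcgcap >> 11) & 1:
        decoded["threshold"] = THRESHOLD_STATUS[(status >> 53) & 0x3]
    # Bits 52:38 hold a corrected error count,
    # but only when MCG_CAP bit 10 (CMCI_P) is set.
    if (mcgcap >> 10) & 1:
        decoded["corrected_count"] = (status >> 38) & 0x7FFF
    return decoded


if __name__ == "__main__":
    # Values copied from the first and third events listed above.
    for raw in (0x942000460082110A, 0x902000420082110A):
        print(hex(raw), decode_status(raw, mcgcap=0x00000C07))
```

This also shows why only events 1, 2 and 11 carry an addr= field: their status starts with 0x94, which sets the ADDRV bit, while the other events start with 0x90.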

Jean-Luc Aufranc (CNXSoft)

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.

39 Replies to “Checking Out Machine Check Exception (MCE) Errors in Linux”

> the file /var/log/mcelog is now up

That’s how we monitor x86 commodity hardware: installing the mcelog package, then defining a primitive rule watching the size of /var/log/mcelog. Once the size exceeds 0 you have a problem.

As for the occurrences of these 2nd level cache errors: most probably they only occur once data is pumped through the CPU cores (e.g. running 7-zip or cpuminer as part of sbc-bench for example).
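
The size-based rule described in this comment could look something like the following Python sketch. It is an illustration only, not the commenter's actual monitoring setup; the function name `mcelog_has_events` and the choice to treat a missing file as "nothing logged" are assumptions.

```python
import os


def mcelog_has_events(path: str = "/var/log/mcelog") -> bool:
    """Return True when the log file exists and is non-empty."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # A missing or unreadable file is treated as "nothing logged".
        return False
```

A monitoring agent or cron job could call this periodically and raise an alert as soon as it returns True.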

Jean-Luc Aufranc (CNXSoft) says:

July 17, 2019 at 17:44

Thanks. I also got some errors today, while ODROID-H2 was mostly idle.

Could someone provide me with some help?

I use a small firewall with IPFire and get mce errors. How is this to be interpreted?

# du -sh mcelog
24K    mcelog

# cat /var/log/mcelog
Kernel does not support page offline interface
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4d35220 
TIME 1665391460 Mon Oct 10 10:44:20 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4f49c70 
TIME 1665391787 Mon Oct 10 10:49:47 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10a5fab60 
TIME 1665392770 Mon Oct 10 11:06:10 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c507a3b0 
TIME 1665396047 Mon Oct 10 12:00:47 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4eb3c80 
TIME 1665396375 Mon Oct 10 12:06:15 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b322690 
TIME 1665399324 Mon Oct 10 12:55:24 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4e4b9b0 
TIME 1665401618 Mon Oct 10 13:33:38 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b31ef00 
TIME 1665405878 Mon Oct 10 14:44:38 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10bfe3f80 
TIME 1665408499 Mon Oct 10 15:28:19 2022
MCG status:
MCi status:
Error overflow
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS c400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 4 
ADDR 1c5a01f00 
TIME 1665424555 Mon Oct 10 19:55:55 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b31ef00 
TIME 1665427177 Mon Oct 10 20:39:37 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4ce3b10 
TIME 1665429471 Mon Oct 10 21:17:51 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 11980e060 
TIME 1665431764 Mon Oct 10 21:56:04 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 11980f2f0 
TIME 1665432747 Mon Oct 10 22:12:27 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10ad88480 
TIME 1665435369 Mon Oct 10 22:56:09 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10bf3f7f0 
TIME 1665439301 Tue Oct 11 00:01:41 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10be87580 
TIME 1665439629 Tue Oct 11 00:07:09 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b250530 
TIME 1665439956 Tue Oct 11 00:12:36 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b31ef00 
TIME 1665441922 Tue Oct 11 00:45:22 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 277cc0470 
TIME 1665442250 Tue Oct 11 00:50:50 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b31bd80 
TIME 1665448148 Tue Oct 11 02:29:08 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c50d1740 
TIME 1665448804 Tue Oct 11 02:40:04 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c6d9ffa0 
TIME 1665450442 Tue Oct 11 03:07:22 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4d35250 
TIME 1665451425 Tue Oct 11 03:23:45 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10a60f7e0 
TIME 1665452736 Tue Oct 11 03:45:36 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10ad876b0 
TIME 1665456340 Tue Oct 11 04:45:40 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b322780 
TIME 1665458962 Tue Oct 11 05:29:22 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4e379a0 
TIME 1665459289 Tue Oct 11 05:34:49 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c50d1740 
TIME 1665461255 Tue Oct 11 06:07:35 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b322b60 
TIME 1665463877 Tue Oct 11 06:51:17 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10ac5c170 
TIME 1665471414 Tue Oct 11 08:56:54 2022
MCG status:
MCi status:
Error overflow
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS c400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b7b1280 
TIME 1665473707 Tue Oct 11 09:35:07 2022
MCG status:
MCi status:
Error overflow
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS c400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 13475fd50 
TIME 1665474363 Tue Oct 11 09:46:03 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b0c3c80 
TIME 1665474690 Tue Oct 11 09:51:30 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 277cc0470 
TIME 1665475673 Tue Oct 11 10:07:53 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b322750 
TIME 1665480916 Tue Oct 11 11:35:16 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 277c76c10 
TIME 1665481244 Tue Oct 11 11:40:44 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c5547b90 
TIME 1665485176 Tue Oct 11 12:46:16 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1045410a0 
TIME 1665485504 Tue Oct 11 12:51:44 2022
MCG status:
MCi status:
Error overflow
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS c400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10abdedd0 
TIME 1665487142 Tue Oct 11 13:19:02 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 277cf4e50 
TIME 1665491730 Tue Oct 11 14:35:30 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 103310090 
TIME 1665500905 Tue Oct 11 17:08:25 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 10b322a60 
TIME 1665503854 Tue Oct 11 17:57:34 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c4c04630 
TIME 1665508769 Tue Oct 11 19:19:29 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 1c53eef90 
TIME 1665509752 Tue Oct 11 19:35:52 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
Hardware event. This is not a software error.
MCE 0
CPU 2 BANK 4 
ADDR 11288e730 
TIME 1665511063 Tue Oct 11 19:57:43 2022
MCG status:
MCi status:
Corrected error
MCi_ADDR register valid
MCA: Instruction CACHE Level-2 Instruction-Fetch Error
STATUS 8400000000010151 MCGSTATUS 0
MCGCAP 806 APICID 4 SOCKETID 0 
MICROCODE 411
CPUID Vendor Intel Family 6 Model 76 Step 4
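For readers wondering how the STATUS values in the log above relate to the "Corrected error", "Error overflow" and "MCi_ADDR register valid" lines, here is a minimal Python sketch. It assumes the flag layout of the IA32_MCi_STATUS register as described in Intel's Software Developer's Manual; the name `decode_status` is made up for illustration and is not part of mcelog.

```python
# Flag bits in IA32_MCi_STATUS (assumption: layout as documented in the
# Intel SDM machine-check architecture chapter).
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # error overflow: another error arrived before this one was read
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}


def decode_status(status_hex):
    """Split a STATUS value printed by mcelog into flags and error codes."""
    value = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (value >> bit) & 1]
    return {
        "flags": flags,
        "corrected": not (value >> 61) & 1,            # UC bit clear
        "mca_code": value & 0xFFFF,                     # bits 15:0
        "model_specific_code": (value >> 16) & 0xFFFF,  # bits 31:16
    }


# The two STATUS values that appear in the log above.
for status in ("8400000000010151", "c400000000010151"):
    decoded = decode_status(status)
    kind = "corrected" if decoded["corrected"] else "uncorrected"
    print(status, decoded["flags"], kind, hex(decoded["mca_code"]))
```

For the two values in the log, the sketch reports VAL and ADDRV for 8400000000010151, and additionally OVER for c400000000010151, with UC clear in both cases. That matches the text mcelog printed next to each record.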


theguyuk says:

July 17, 2019 at 17:43

Might be worth also posting on Odroid H2 forum, issues or Ubuntu, maybe raise the number of board users reading this?

1. Jean-Luc Aufranc (CNXSoft) says:
  
  July 17, 2019 at 17:45
  
  I’m not sure it’s important yet. After all, the board works just fine. I’ll contact Hardkernel.
  
  1. theguyuk says:
    
    July 17, 2019 at 17:59
    
    There seem to be posts about incompatible memory on the Odroid forum. Could it be a memory or hardware chip driver issue?
    
    1. Jean-Luc Aufranc (CNXSoft) says:
      
      July 17, 2019 at 19:10
      
      I’m using the DDR4 modules provided by Hardkernel, and there aren’t any memory errors, only L2 cache errors.
      The problem could potentially be that the latest batch of Gemini Lake processors has some cache issues (TBC), which may explain why I have somewhat lower performance compared to earlier boards.
      
      1. willy says:
        
        July 17, 2019 at 20:55
        
        Cache issues in CPUs do happen. I’ve been hit many times with those in UltraSPARC processors for example (and my U5 is sick again due to this for the second time by the way). I think modern CPUs use parity to better resist issues caused by heat and other random events. It’s “simple” enough to drop a line and re-read it when reading inconsistent data (it’s more difficult when changes are present but nothing prevents re-reading the faulty word if it was not modified). In case that’s what you’re facing, there could be a corner case of locally-modified data which has changed before re-reading it and in this case it’s guaranteed data corruption. Better ask HK about it. If your board is the only one with this problem they might prefer to exchange it. I do have the exact same CPU on another board (ASRock) and am not facing any such issues. I’ve just installed rasdaemon which noticed nothing either.
      2. Laurent says:
        
        July 17, 2019 at 21:04
        
        Some Intel chips go beyond parity checking and have ECC on L2. But I don’t know if this chip supports it, Intel doesn’t seem to document that openly.
        
        Found a command to check that: sudo dmidecode and look for “Error Correction Type”. On my i7-8650U it shows L1D is parity protected, L2 is one-bit ECC and L3 is multi-bit ECC.
      3. willy says:
        
        July 17, 2019 at 21:08
        
        Yes, I’m pretty sure that high-end chips do have ECC, but here we’re talking about an Atom basically. Well, marketingly speaking it’s a Celeron 🙂
      4. Laurent says:
        
        July 17, 2019 at 21:39
        
        willy, you’re correct for DRAM ECC, here we’re talking about the internal caches 😉
      5. willy says:
        
        July 17, 2019 at 22:05
        
        Laurent, I was also speaking about internal caches 🙂 I’m sure I’ve seen it mentioned a few times in the past for Xeons and have some memories of seeing things like “L2 cache ECC” in some server BIOSes in the past. Note that it might have been quite old since my memory associates this to L2 not L3. Also IBM’s POWER8 and above definitely use ECC for L2 & L3.
      6. Jean-Luc Aufranc (CNXSoft) says:
        
        July 17, 2019 at 21:15
        
        Here’s the output of sudo dmidecode on ODROID-H2 for anyone interested.
        https://pastebin.cnx-software.com/?f3cdd3bc98d9ae09#6uAEW7Ab2csDUCNS1qjFG9a1z8UT5R65FaWC7U3aePvo
        
        Error Correction Type: Single-bit ECC for L2 cache
      7. willy says:
        
        July 17, 2019 at 22:07
        
        The link doesn’t work here, it displays “privatebin is a minimalist …”.
      8. Jean-Luc Aufranc (CNXSoft) says:
        
        July 17, 2019 at 22:15
        
        It happens sometimes, and I have not figured out why yet: https://github.com/PrivateBin/PrivateBin/issues/453
        If you are on desktop, can you try to press Ctrl+F5? If JavaScript is disabled this may not work at all.
        In Android, data saver mode can create an issue: https://github.com/PrivateBin/PrivateBin/wiki/FAQ#how-to-make-privatebin-work-on-my-android-phone-with-data-saver-mode
      9. willy says:
        
        July 17, 2019 at 22:29
        
        Tried already, and again, but same result. It’s no big deal though, don’t worry 🙂
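The dmidecode check discussed in the thread above can be scripted. What follows is a minimal Python sketch, assuming the usual "Cache Information" layout that `dmidecode -t cache` prints, with "Socket Designation" and "Error Correction Type" fields. The helper names are hypothetical, and the sample text is illustrative rather than copied from the ODROID-H2 output.

```python
import subprocess


def read_dmidecode_cache():
    """Run `dmidecode -t cache` (requires root) and return its text output."""
    return subprocess.run(
        ["dmidecode", "-t", "cache"],
        capture_output=True, text=True, check=True,
    ).stdout


def cache_error_correction(text):
    """Map each cache's Socket Designation to its Error Correction Type."""
    result = {}
    designation = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # lines such as "Cache Information" carry no field
        if key == "Socket Designation":
            designation = value.strip()
        elif key == "Error Correction Type" and designation is not None:
            result[designation] = value.strip()
    return result


# Illustrative sample (assumption: not taken from a real machine).
SAMPLE = """\
Cache Information
    Socket Designation: L1 Cache
    Error Correction Type: Parity
Cache Information
    Socket Designation: L2 Cache
    Error Correction Type: Single-bit ECC
"""

print(cache_error_correction(SAMPLE))
```

On a real system, `cache_error_correction(read_dmidecode_cache())` would be run as root instead of using the sample text.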

Hi there,

This is not normal. One error every couple of months due to cosmic rays would be normal.

You are being warned because at that rate there is something wrong with the hardware or too much EMI from another device.
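To put a number on "that rate", the records in /var/log/mcelog can be counted and timed. Below is a rough Python sketch, assuming the plain-text record layout seen in the log above: a "Hardware event." line opens each record, followed by CPU/BANK and TIME lines. `parse_mcelog` and `summarize` are hypothetical helper names, and the two sample records are trimmed copies of entries from that log.

```python
import re
from collections import Counter


def parse_mcelog(text):
    """Return one dict per record found in mcelog's plain-text log."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Hardware event."):
            current = {"overflow": False}
            records.append(current)
        elif current is None:
            continue  # ignore anything before the first record
        elif (m := re.match(r"CPU (\d+) BANK (\d+)", line)):
            current["cpu"], current["bank"] = int(m[1]), int(m[2])
        elif (m := re.match(r"TIME (\d+)", line)):
            current["time"] = int(m[1])
        elif line == "Error overflow":
            current["overflow"] = True
    return records


def summarize(records):
    """Count events per (CPU, bank), overflow flags, and the time span covered."""
    complete = [r for r in records if {"cpu", "bank", "time"} <= r.keys()]
    times = sorted(r["time"] for r in complete)
    return {
        "events": len(complete),
        "per_cpu_bank": dict(Counter((r["cpu"], r["bank"]) for r in complete)),
        "overflows": sum(r["overflow"] for r in complete),
        "span_seconds": times[-1] - times[0] if times else 0,
    }


# Two records trimmed from the log above.
SAMPLE = """\
Hardware event. This is not a software error.
CPU 2 BANK 4
TIME 1665463877 Tue Oct 11 06:51:17 2022
Corrected error
Hardware event. This is not a software error.
CPU 2 BANK 4
TIME 1665471414 Tue Oct 11 08:56:54 2022
Error overflow
Corrected error
"""

print(summarize(parse_mcelog(SAMPLE)))
```

Feeding the whole log file to `parse_mcelog` instead of the sample would show whether the events cluster on one core and how often the overflow flag is set.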

tkaiser says:

July 18, 2019 at 20:15

> at that rate there is something wrong with the hardware

So far only ECCed (corrected) single bit flips in L2 cache. This slightly harms performance and will only be a real issue once two bits flip at the same time. While I wouldn’t trust such a CPU that much, for the average use case a Gemini Lake box is used for (media center, desktop) this shouldn’t matter that much.

1. David Willmore says:
  
  July 18, 2019 at 21:40
  
  Physics is physics. You’re going to get bit flips in caches. Beta particles from C-14 decomposition. Beta particles from the Si itself, etc. Then there’s cosmic rays that you just can’t avoid as they’re everywhere. As transistors get smaller and smaller, Johnson–Nyquist noise becomes a bigger issue. The error may have even occurred due to a transmission error on the chip: the right value may have been stored, but it got corrupted between the storage element and the ECC block.
  
  As long as the ECC is catching single bit errors at a relatively low rate, there’s nothing to worry about. If it’s seeing double bit errors, then it’s time to be concerned.
  
  1. willy says:
    
    July 19, 2019 at 10:21
    
    Sometimes there are 4 in the same second, this is far too much, something is busted in this machine, and its ability to recover from all the events you enumerated is affected by this existing one. The probability of two-bit *unrelated* errors remains very low. But if the hardware is defective, the probability of two-bit errors in the same cache line cannot be dismissed.
    

My SSD makes some noise. Maybe that’s the source of the problem. I’ll remove it to check out what happens.

That is usually coil whine and is not the issue.

I’d try setting a lower maximum clock and see if the problem goes away. I’d want to check if power regulation/supply isn’t deficient.

If the problem is the hardware, a torture test like Prime 95 will also make it much worse.
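Lowering the maximum clock, as suggested above, can be tried through the kernel's cpufreq sysfs files. Here is a minimal Python sketch. It assumes the cpufreq driver exposes cpuinfo_max_freq, cpuinfo_min_freq and scaling_max_freq (values in kHz) under /sys/devices/system/cpu/cpuN/cpufreq, and that the script runs as root. `cap_max_frequency` is a hypothetical helper, not an existing tool.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def cap_max_frequency(fraction, root=CPUFREQ_ROOT):
    """Set scaling_max_freq to `fraction` of cpuinfo_max_freq for every CPU.

    Returns {cpu name: frequency written, in kHz}.
    """
    applied = {}
    for policy in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        hw_max = int((policy / "cpuinfo_max_freq").read_text())
        hw_min = int((policy / "cpuinfo_min_freq").read_text())
        # Never ask for less than the hardware minimum.
        target = max(hw_min, int(hw_max * fraction))
        (policy / "scaling_max_freq").write_text(str(target))
        applied[policy.parent.name] = target
    return applied
```

Calling `cap_max_frequency(0.5)` as root would halve the allowed maximum on every core. The setting does not persist across reboots, so it is easy to compare the mcelog error rate with and without the cap.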

David Willmore says:

July 21, 2019 at 20:38

I don’t use an x86 MB/memory combo without first doing at least 24 hours of memtest86 and then Prime95 on torture test. That helps validate memory, processor, power delivery, etc.

1. Mikko Rantalainen says:
  
  November 17, 2020 at 20:52
  
  I used to run memtest86 for 24 hours, too. But then I got burned with broken RAM that memtest86 was not able to find no matter how long I tried. Didn’t know about Prime95 at that time but running sha1sum over all files in the filesystem repeatedly returned different hashes for the same files. And the problem went away by removing 2 of the 4 memory cards.
  
  As such, I think running memtest86 is just a waste of time. Prime95 / mprime -t is a much better option. However, you need to manually adjust the RAM usage to cover all chips. If you want to test your whole system, run max load *at the same time* (e.g. if you have a high-end GPU, run 3D benchmarks, and run fio random reads and writes to all storage devices).
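The repeated sha1sum check described above is easy to automate. The sketch below is a minimal Python version, assuming the files under the chosen directory are not modified while the test runs. `hash_tree` and `unstable_files` are hypothetical names, not an existing utility.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Return {relative path: SHA-1 hex digest} for every regular file under root."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            sha1 = hashlib.sha1()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    sha1.update(chunk)
            digests[str(path.relative_to(root))] = sha1.hexdigest()
    return digests


def unstable_files(root, passes=3):
    """Hash the tree `passes` times and list files whose digest ever changed."""
    baseline = hash_tree(root)
    changed = set()
    for _ in range(passes - 1):
        current = hash_tree(root)
        changed |= {p for p in baseline if current.get(p) != baseline[p]}
    return sorted(changed)
```

On healthy hardware `unstable_files("/some/read-only/tree")` should return an empty list. Any path it reports means the same bytes hashed differently between passes, which points at RAM, CPU, or storage rather than software.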

I’ve disconnected all SATA data and power cable, and the errors are still there. It seems even worse than the first time I ran sbc-bench.sh:

dmesg output while running the benchmarks:

[ 4666.927174] mce: CMCI storm detected: switching to poll mode
[ 4719.901859] mce_notify_irq: 67 callbacks suppressed
[ 4719.901860] mce: [Hardware Error]: Machine check events logged
[ 4720.893787] mce: [Hardware Error]: Machine check events logged
[ 4780.888059] mce_notify_irq: 57 callbacks suppressed
[ 4780.888061] mce: [Hardware Error]: Machine check events logged
[ 4781.879955] mce: [Hardware Error]: Machine check events logged
[ 4841.874341] mce_notify_irq: 47 callbacks suppressed
[ 4841.874343] mce: [Hardware Error]: Machine check events logged
[ 4842.898260] mce: [Hardware Error]: Machine check events logged
[ 4908.876182] mce_notify_irq: 6 callbacks suppressed
[ 4908.876184] mce: [Hardware Error]: Machine check events logged
[ 4909.868060] mce: [Hardware Error]: Machine check events logged
[ 4971.878474] mce_notify_irq: 23 callbacks suppressed
[ 4971.878475] mce: [Hardware Error]: Machine check events logged
[ 4974.886197] mce: [Hardware Error]: Machine check events logged
[ 4986.997749] perf: interrupt took too long (2515 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
[ 5061.855379] mce_notify_irq: 5 callbacks suppressed
[ 5061.855380] mce: [Hardware Error]: Machine check events logged
[ 5063.870289] mce: [Hardware Error]: Machine check events logged
[ 5119.421001] perf: interrupt took too long (3162 > 3143), lowering kernel.perf_event_max_sample_rate to 63250
[ 5122.873103] mce_notify_irq: 17 callbacks suppressed
[ 5122.873104] mce: [Hardware Error]: Machine check events logged
[ 5126.872745] mce: [Hardware Error]: Machine check events logged
[ 5300.841693] mce_notify_irq: 6 callbacks suppressed
[ 5300.841695] mce: [Hardware Error]: Machine check events logged
[ 5301.833546] mce: [Hardware Error]: Machine check events logged
[ 5378.851010] mce_notify_irq: 13 callbacks suppressed
[ 5378.851012] mce: [Hardware Error]: Machine check events logged
[ 5379.842932] mce: [Hardware Error]: Machine check events logged
[ 5439.837883] mce_notify_irq: 26 callbacks suppressed
[ 5439.837884] mce: [Hardware Error]: Machine check events logged
[ 5440.829814] mce: [Hardware Error]: Machine check events logged
[ 5500.824759] mce_notify_irq: 32 callbacks suppressed
[ 5500.824760] mce: [Hardware Error]: Machine check events logged
[ 5502.840600] mce: [Hardware Error]: Machine check events logged
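
Since the kernel rate-limits this message, the number of printed lines understates how often the notification fired. A small sketch for tallying both figures from saved dmesg text follows; the function name is made up, and the two patterns are taken from the log lines shown above. The totals count notifications, not individual corrected errors.

```python
import re

LOGGED = re.compile(r"mce: \[Hardware Error\]: Machine check events logged")
SUPPRESSED = re.compile(r"mce_notify_irq: (\d+) callbacks suppressed")


def tally_mce(dmesg_text):
    """Return (printed notifications, suppressed callbacks) from dmesg text."""
    printed = suppressed = 0
    for line in dmesg_text.splitlines():
        if LOGGED.search(line):
            printed += 1
        match = SUPPRESSED.search(line)
        if match:
            suppressed += int(match.group(1))
    return printed, suppressed
```

It could be fed with the output of `sudo dmesg` saved to a file and read back with `open(...).read()`.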

Before I ran sbc-bench.sh I had 924 errors logged, and after:

 sudo ras-mc-ctl --summary
[sudo] password for odroid: 
No Memory errors.

No PCIe AER errors.

No Extlog errors.
MCE records summary:
	1298 corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error errors
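
Going from 924 to 1298 means 374 additional corrected errors were recorded during this run. A rough sketch for computing that difference automatically is shown below. It assumes the summary layout printed above (a count followed by a description under "MCE records summary:") and that the earlier 924 figure refers to the same error description; the function names are invented.

```python
import re

RECORD = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def mce_counts(summary_text):
    """Parse the 'MCE records summary' part of `ras-mc-ctl --summary` output."""
    counts = {}
    in_mce = False
    for line in summary_text.splitlines():
        if line.strip() == "MCE records summary:":
            in_mce = True
            continue
        match = RECORD.match(line) if in_mce else None
        if match:
            counts[match.group(2).strip()] = int(match.group(1))
    return counts


def new_errors(before, after):
    """Per-description increase between two parsed summaries."""
    return {k: v - before.get(k, 0) for k, v in after.items() if v > before.get(k, 0)}
```

Saving the summary before and after a benchmark and passing both parsed results to `new_errors` gives the number of errors attributable to that run.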

I’ll try to change the max frequency and see what happens.

I’ve gone to the BIOS and changed two settings:

Turbo mode - Disabled
Boost performance mode - Max battery

I ran sbc-bench again, and all MCE errors are gone:

 sudo ./sbc-bench.sh 
WARNING: this tool is meant to run only on Debian Stretch or Ubuntu Bionic.
When running on other distros results are partially meaningless or can't be collected.
Press [ctrl]-[c] to stop or [enter] to continue.

sbc-bench v0.6.7

Installing needed tools. This may take some time... Done.
Checking cpufreq OPP... Done.
Executing tinymembench. This will take a long time... Done.
Executing OpenSSL benchmark. This will take 3 minutes... Done.
Executing 7-zip benchmark. This will take a long time... Done.
Checking cpufreq OPP... Done.

Memory performance:
memcpy: 4133.3 MB/s 
memset: 5553.0 MB/s 

7-zip total scores (3 consecutive runs): 5614,5609,5588

OpenSSL results:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc     193865.69k   413265.43k   520197.12k   565172.91k   579775.15k   580577.96k
aes-128-cbc     193921.41k   413302.57k   520192.43k   565357.91k   579802.45k   580954.79k
aes-192-cbc     182091.22k   362645.35k   442027.18k   475023.36k   485468.84k   486025.90k
aes-192-cbc     182060.84k   363012.78k   441959.42k   474992.30k   485447.00k   484498.29k
aes-256-cbc     171556.62k   323270.63k   378806.10k   407945.90k   416776.19k   417518.93k
aes-256-cbc     171584.94k   323483.31k   378881.88k   407989.59k   417101.14k   417480.70k

Full results uploaded to http://ix.io/1P7S. Please check the log for anomalies (e.g. swapping
or throttling happened) and otherwise share this URL.

tkaiser says:

July 22, 2019 at 15:19

> I’ve gone to the BIOS and changed two settings

And you lost around 1/3 of the CPU performance 🙂

Smells a bit like unstable high DVFS OPP…
Eversor says:

July 22, 2019 at 22:29

Yeah, neither of those modes should be the cause of this unless they’re bugged. I agree with tkaiser that it’s probably a DVFS bug. It could also be something in the hardware design, like capacitors, that can’t handle higher current swings.

Eversor says:

July 19, 2019 at 16:53

BTW, I once had an issue with a motherboard BIOS where it would have correctable and uncorrectable L1/L2 errors when doing light loads.

The board was initially K8 only but eventually supported Phenom 1 and 2.

Somewhere something broke, as the motherboard induced errors if I enabled frequency scaling on the K8 but worked fine with a Phenom II – which used a different version of C&Q.

K8 CPU worked fine if set at any fixed speed, just couldn’t change on demand. Prob was changing clocks too fast, without letting the VRM stabilize at the higher voltages first.

theguyuk says:

July 19, 2019 at 17:14

Have Odroid replied yet?

1. Jean-Luc Aufranc (CNXSoft) says:
  
  July 19, 2019 at 17:16
  
  Yes:
  
  I couldn’t see any L2 cache error correction yet since I booted my H2 around 30 hours ago.
  …
  I think there should be no critical issue probably because any single bit error in the Cache memory was corrected automatically.
  
  1. willy says:
    
    July 19, 2019 at 21:55
    
    We could have hoped for better, like “given that yours shows problems we cannot reproduce, it definitely indicates a hardware issue and a risk of accelerated aging, so we’re going to replace it”. Their response is a bit disappointing.
    
    1. Jean-Luc Aufranc (CNXSoft) says:
      
      July 20, 2019 at 09:29
      
      For full context, it’s a review sample, and I did not pay for it. Hardkernel just sent the kit to me free of charge.
      
      1. willy says:
        
        July 21, 2019 at 12:19
        
        OK that’s understandable then. Still they’d better make some statements like “oh we know our early samples were not perfect” than let the doubt exist about their hardware.
      2. Eversor says:
        
        July 22, 2019 at 22:31
        
        I agree with Willy – most manufacturers would want the board back for further testing. Doesn’t give much confidence that they will do proper Q/A.
Avra says:

July 28, 2019 at 02:32

Thank you for reporting the MCE problem on Gemini Lake. It helped me decide not to buy it. I already have MCE errors on an Apollo Lake ASRock J3455-ITX and thought that the problem would not show up on Gemini Lake. Unfortunately that does not seem to be the case…

avra@falcon:~$ uname -a
Linux falcon 4.19.0-0.bpo.2-amd64 #1 SMP Debian 4.19.16-1~bpo9+1 (2019-02-07) x86_64 GNU/Linux
avra@falcon:~$ dmesg | grep microcode
dmesg: read kernel buffer failed: Operation not permitted
avra@falcon:~$ sudo dmesg | grep microcode
[sudo] password for avra:
[ 0.961770] mce: [Hardware Error]: PROCESSOR 0:506c9 TIME 1564253320 SOCKET 0 APIC 0 microcode 1e
[ 3.731218] microcode: sig=0x506c9, pf=0x1, revision=0x1e
[ 3.731754] microcode: Microcode Update Driver: v2.2.
avra@falcon:~$ sudo dmesg | grep mce
[ 0.935565] mce: CPU supports 7 MCE banks
[ 0.961475] mce: [Hardware Error]: Machine check events logged
[ 0.961569] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: a600000000020408
[ 0.961673] mce: [Hardware Error]: TSC 0 ADDR fef135c0
[ 0.961770] mce: [Hardware Error]: PROCESSOR 0:506c9 TIME 1564253320 SOCKET 0 APIC 0 microcode 1e
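
The 64-bit value after "Bank 4:" is the raw machine check status register for that bank. The sketch below splits it into the architectural fields of IA32_MCi_STATUS using the bit positions from Intel's Software Developer's Manual; the function and key names are invented for illustration, and the model-specific part is left undecoded.

```python
def decode_mci_status(status):
    """Split a 64-bit IA32_MCi_STATUS value into its architectural fields."""
    def bit(n):
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),            # VAL: the register holds a logged error
        "overflow": bit(62),         # OVER: an earlier error was overwritten
        "uncorrected": bit(61),      # UC: hardware did not correct the error
        "enabled": bit(60),          # EN: reporting was enabled for this error
        "misc_valid": bit(59),       # MISCV: the MISC register holds extra data
        "addr_valid": bit(58),       # ADDRV: the ADDR register holds an address
        "context_corrupt": bit(57),  # PCC: processor context may be corrupt
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }
```

For `0xa600000000020408` this reports a valid, uncorrected error with the address-valid and context-corrupt bits set, an MCA error code of 0x0408 and a model-specific code of 0x0002, which is a different kind of record from the corrected L2 cache errors discussed in the article.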

1. Jean-Luc Aufranc (CNXSoft) says:
  
  July 28, 2019 at 11:05
  
  I think it’s a matter of luck. Most processors won’t have this issue, but some will.
  Your problem looks a bit different. Does it cause you troubles or just output errors without actual user-facing issues?
  
Hugh says:

August 26, 2019 at 21:36

This is an error in the processor. It could be induced by bad power or inappropriate clock speed. Other than that, it is likely a busted processor chip.

If you are just using the machine for benchmarking, who cares.
Otherwise, I would discard the board.
(I’m assuming that the processor is not socketed.)

The right way of thinking about ECC is as a life preserver. Your boat is sinking, but you can float long enough to get to another boat. You don’t use it to try to keep the old boat floating.

Lithopsian says:

October 15, 2021 at 00:01

The “Drivers not loaded” message is a bit spurious. It just looks in /proc/modules for anything having edac in the name and assumes that is good, regardless of whether it is the right driver for your machine. If it doesn’t see anything, for example if the correct edac driver is built in to the kernel, then it reports that the driver isn’t loaded. So the message has nothing to do with MCE and may or may not be reporting the status of EDAC monitoring.
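
The name-matching heuristic described above can be illustrated with a few lines of Python. This is not the actual ras-mc-ctl code (that tool is a Perl script); it only mimics the described behaviour, and the function name and sample module lines are made up.

```python
def edac_modules(proc_modules_text):
    """Names of loaded modules containing 'edac' (first field of each line)."""
    names = (line.split()[0] for line in proc_modules_text.splitlines() if line.strip())
    return [name for name in names if "edac" in name.lower()]


def drivers_message(proc_modules_text):
    """Mimic the coarse check: any 'edac' module name counts as loaded."""
    return "drivers are loaded" if edac_modules(proc_modules_text) else "drivers not loaded"
```

A driver compiled into the kernel never appears in /proc/modules, so this kind of check reports "drivers not loaded" even when EDAC monitoring is active, which matches the behaviour described in the comment.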
