I have a GitHub repository that is predominantly C++ but contains a lot of vendor-generated C code (drivers for a microcontroller), which completely throws off the language statistics. I have read this page and created a .gitattributes file in my repository that should mark all of these driver files as linguist-vendored and keep them out of the statistics. Although git check-attr reports the linguist-vendored attribute as set, the github-linguist command-line tool still ignores it. What am I doing wrong?
$ cat .gitattributes
STM32[[:space:]]Code/*/** linguist-vendored
STM32[[:space:]]Code/*/Core/Src/** -linguist-vendored
STM32[[:space:]]Code/*/Core/Inc/** -linguist-vendored
$ git add .gitattributes
$ git commit --amend --no-edit
[master 017861e] fix github language metrics
Date: Sat Sep 25 16:09:00 2021 -0700
1 file changed, 3 insertions(+)
create mode 100644 .gitattributes
$ git check-attr -a "STM32 Code/BLDC/Drivers/STM32F3xx_HAL_Driver/Src/stm32f3xx_hal.c"
STM32 Code/BLDC/Drivers/STM32F3xx_HAL_Driver/Src/stm32f3xx_hal.c: linguist-vendored: set
$ github-linguist --breakdown
94.75% C
2.92% C++
2.09% Makefile
0.23% Assembly
0.01% Shell
...
C:
STM32 Code/BLDC/Drivers/STM32F3xx_HAL_Driver/Src/stm32f3xx_hal.c
...
I have also tried changing the .gitattributes file to just
STM32[[:space:]]Code/** linguist-vendored
and it still doesn't ignore the files inside.
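To separate "the patterns are wrong" from "the tool is not honouring them", the attribute resolution can be reproduced in a throwaway repository. The following is a minimal sketch, assuming only that git and mktemp are available; the scratch directory and the Core/Src/main.c path are hypothetical and used purely for illustration, while the patterns and the driver path are the ones from the question.

```shell
#!/bin/sh
set -eu

# Scratch repository so the experiment cannot touch the real project.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .

# The same three patterns as in the question.
cat > .gitattributes <<'EOF'
STM32[[:space:]]Code/*/** linguist-vendored
STM32[[:space:]]Code/*/Core/Src/** -linguist-vendored
STM32[[:space:]]Code/*/Core/Inc/** -linguist-vendored
EOF

# check-attr resolves attributes by path name; the files need not exist.
driver=$(git check-attr linguist-vendored -- \
  "STM32 Code/BLDC/Drivers/STM32F3xx_HAL_Driver/Src/stm32f3xx_hal.c")
# Hypothetical application source that the second pattern should un-vendor.
core=$(git check-attr linguist-vendored -- "STM32 Code/BLDC/Core/Src/main.c")

printf '%s\n' "$driver" "$core"
```

If Git resolves the driver file as set and the Core/Src file as unset here, the patterns themselves are doing what was intended, and the discrepancy lies in how github-linguist reads the attributes rather than in the .gitattributes file.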
I suspect you may be using an old version of Linguist and/or Rugged (the library that actually checks the gitattributes), possibly the versions shipped with your OS. I can't reproduce your behaviour with the latest version, which is the one GitHub is using: