Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts

The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.

Building genksyms with W=1 generates the following warnings:

    YACC    scripts/genksyms/parse.tab.[ch]
  scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
  scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
  scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples

The comment in the parser describes the current problem:

    /* This wasn't really a typedef name but an identifier that
       shadows one.  */

Consider the following simple C code:

    typedef int foo;
    void my_func(foo foo) {}

In the function parameter list (foo foo), the first 'foo' is a type
specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
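
As a minimal sketch of why this is valid C and not a redeclaration error, the
snippet below compiles the same pattern and uses the parameter inside the
body. The names twice() and check_shadowing() are made up for illustration
and are not part of genksyms.

```c
#include <assert.h>

typedef int foo;

/*
 * The first 'foo' names the typedef. The second 'foo' declares a parameter
 * that shadows the typedef for the rest of the function.
 */
static int twice(foo foo)
{
	return foo * 2;	/* here 'foo' is the parameter, not the type */
}

/* Illustrative self-check: the shadowing parameter behaves like any int. */
static void check_shadowing(void)
{
	assert(twice(21) == 42);
}
```

Calling check_shadowing() from any test harness exercises the shadowed
parameter.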

However, the lexer cannot distinguish between the two. Since 'foo' is
already typedef'ed, the lexer returns TYPE for both instances, instead
of returning IDENT for the second one.
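
The problem can be sketched with a toy classifier that, like the lexer
described above, decides purely by typedef lookup. This is not the genksyms
code: known_typedefs, is_typedef_name() and classify() are hypothetical
stand-ins for the real symbol table and lexer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum token { IDENT, TYPE };

/* Hypothetical stand-in for the typedef symbol table. */
static const char *known_typedefs[] = { "foo" };

static bool is_typedef_name(const char *name)
{
	size_t n = sizeof(known_typedefs) / sizeof(known_typedefs[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(known_typedefs[i], name) == 0)
			return true;
	return false;
}

/* Classify by table lookup only, with no knowledge of the grammar context. */
static enum token classify(const char *name)
{
	assert(name != NULL);
	return is_typedef_name(name) ? TYPE : IDENT;
}
```

With this scheme, both words of the parameter (foo foo) classify as TYPE,
because the lookup gives the same answer no matter where the word appears.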

To support shadowed identifiers, the grammar accepts TYPE where a declarator
name is expected, so TYPE can be reduced to either a simple_type_specifier
or a declarator such as direct_abstract_declarator, which makes the grammar
ambiguous.

Without analyzing the grammar context, it is very difficult to resolve
this correctly.

This commit introduces a flag, dont_want_type_specifier, which allows
the parser to inform the lexer whether an identifier is expected. When
dont_want_type_specifier is true, the type lookup is suppressed, and
the lexer returns IDENT regardless of any preceding typedef.
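
The idea can be sketched with a toy lexer. Only the flag name and the lookup
condition follow this commit; lex_word(), is_typedef_name() and
lex_parameter() are hypothetical, and the real parser toggles the flag from
grammar actions instead of a helper function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum token { IDENT, TYPE };

/* Set by the (toy) parser once it has consumed a type specifier. */
static bool dont_want_type_specifier;

/* Toy typedef table: only "foo" is a typedef name. */
static bool is_typedef_name(const char *name)
{
	return strcmp(name, "foo") == 0;
}

/* Skip the typedef lookup whenever the parser expects an identifier. */
static enum token lex_word(const char *name)
{
	assert(name != NULL);
	if (!dont_want_type_specifier && is_typedef_name(name))
		return TYPE;
	return IDENT;
}

/* Stand-in for the parser actions while reading the parameter "foo foo". */
static void lex_parameter(enum token out[2])
{
	dont_want_type_specifier = false;
	out[0] = lex_word("foo");		/* type specifier position */
	dont_want_type_specifier = true;	/* type specifier was seen */
	out[1] = lex_word("foo");		/* declarator position */
	dont_want_type_specifier = false;	/* declarator was seen */
}
```

In this sketch the first 'foo' comes back as TYPE and the second as IDENT,
which is the distinction the grammar needs.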

After this commit, only 3 shift/reduce conflicts remain, and no
reduce/reduce conflicts.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>

3 files changed, 26 insertions(+), 23 deletions(-)

scripts/genksyms/genksyms.h (+3)
···
 #ifndef MODUTILS_GENKSYMS_H
 #define MODUTILS_GENKSYMS_H 1

+#include <stdbool.h>
 #include <stdio.h>

 #include <list_types.h>
···
 int yylex(void);
 int yyparse(void);
+
+extern bool dont_want_type_specifier;

 void error_with_pos(const char *, ...) __attribute__ ((format(printf, 1, 2)));

scripts/genksyms/lex.l (+8 -1)
···
 %{

 #include <limits.h>
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
···
 /* The second stage lexer.  Here we incorporate knowledge of the state
    of the parser to tailor the tokens that are returned.  */

+/*
+ * The lexer cannot distinguish whether a typedef'ed string is a TYPE or an
+ * IDENT. We need a hint from the parser to handle this accurately.
+ */
+bool dont_want_type_specifier;
+
 int
 yylex(void)
 {
···
               goto repeat;
             }
         }
-      if (!suppress_type_lookup)
+      if (!suppress_type_lookup && !dont_want_type_specifier)
         {
           if (find_symbol(yytext, SYM_TYPEDEF, 1))
             token = TYPE;

scripts/genksyms/parse.y (+15 -22)
···
 %{

 #include <assert.h>
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 #include "genksyms.h"
···
                     current_name = NULL;
                   }
                   $$ = $3;
+                  dont_want_type_specifier = false;
                 }
         ;
···
                                         is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
                   current_name = NULL;
                   $$ = $1;
+                  dont_want_type_specifier = true;
                 }
         | init_declarator_list ',' init_declarator
                 { struct string_list *decl = *$3;
···
                                         is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
                   current_name = NULL;
                   $$ = $3;
+                  dont_want_type_specifier = true;
                 }
         ;
···
                   remove_node($1);
                   $$ = $1;
                 }
-        | type_specifier
+        | type_specifier { dont_want_type_specifier = true; $$ = $1; }
         | type_qualifier
         ;
···
                     current_name = (*$1)->string;
                     $$ = $1;
                   }
-                }
-        | TYPE
-                { if (current_name != NULL) {
-                    error_with_pos("unexpected second declaration name");
-                    YYERROR;
-                  } else {
-                    current_name = (*$1)->string;
-                    $$ = $1;
-                  }
+                  dont_want_type_specifier = false;
                 }
         | direct_declarator '(' parameter_declaration_clause ')'
                 { $$ = $4; }
···
         ;

 direct_nested_declarator:
-        IDENT
-        | TYPE
+        IDENT   { $$ = $1; dont_want_type_specifier = false; }
         | direct_nested_declarator '(' parameter_declaration_clause ')'
                 { $$ = $4; }
         | direct_nested_declarator '(' error ')'
···
 parameter_declaration_list:
         parameter_declaration
+                { $$ = $1; dont_want_type_specifier = false; }
         | parameter_declaration_list ',' parameter_declaration
-                { $$ = $3; }
+                { $$ = $3; dont_want_type_specifier = false; }
         ;

 parameter_declaration:
···
         ptr_operator abstract_declarator
                 { $$ = $2 ? $2 : $1; }
         | direct_abstract_declarator
+                { $$ = $1; dont_want_type_specifier = false; }
         ;

 direct_abstract_declarator:
···
                 { /* For version 2 checksums, we don't want to remember
                      private parameter names.  */
                   remove_node($1);
-                  $$ = $1;
-                }
-        /* This wasn't really a typedef name but an identifier that
-           shadows one.  */
-        | TYPE
-                { remove_node($1);
                   $$ = $1;
                 }
         | direct_abstract_declarator '(' parameter_declaration_clause ')'
···
 member_declaration:
         decl_specifier_seq_opt member_declarator_list_opt ';'
-                { $$ = $3; }
+                { $$ = $3; dont_want_type_specifier = false; }
         | error ';'
-                { $$ = $2; }
+                { $$ = $2; dont_want_type_specifier = false; }
         ;

 member_declarator_list_opt:
···
 member_declarator_list:
         member_declarator
-        | member_declarator_list ',' member_declarator  { $$ = $3; }
+                { $$ = $1; dont_want_type_specifier = true; }
+        | member_declarator_list ',' member_declarator
+                { $$ = $3; dont_want_type_specifier = true; }
         ;

 member_declarator: