Support custom types#303
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends dstep’s macro parsing and type translation to support “custom” type placeholders (types not known at macro definition time) by representing them as CXTypeKind.unexposed and post-processing ambiguous cast-like patterns.
Changes:
- Update type translation helpers to take a full
clang.Typeand emitunexposedtype spellings when no declaration is available. - Extend macro type-name parsing to optionally accept arbitrary identifiers as type-names in
cast/sizeofcontexts. - Add a
fixCastspost-pass to rewrite some ambiguous(SYMBOL) <unary-op> ...parses into binary operator expressions whenSYMBOLmatches a macro parameter.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
dstep/translator/Type.d |
Adjusts low-level type translation to accept Type and emit spelling for unexposed when needed for custom-type macros. |
dstep/translator/MacroParser.d |
Adds “any type” parsing in cast/sizeof, accepts keywords as identifiers in primary expressions, and introduces fixCasts post-processing. |
Comments suppressed due to low confidence (1)
dstep/translator/MacroParser.d:981
parseCastExprnow callsparseTypeName(..., anyType=true), which will treat any identifier in parentheses as a type (viaCXTypeKind.unexposed). This makes patterns like(bar)(a)parse as a cast to the unknown typebarinstead of a function callbar(a), regressing existing behavior (there are unit tests for translating(bar)(a)as a call). Consider tightening the heuristic: only accept an unexposed identifier as a type in cast context when it can be proven to be a type (e.g., present in the type table) or when it matches a macro type-parameter, and otherwise fall back to parsing as a parenthesized expression/call.
Expression parseCastExpr(ref Token[] tokens, Cursor[string] table, bool defined)
{
auto local = tokens;
if (!accept!("(")(local, TokenKind.punctuation))
return parseUnaryExpr(tokens, table, defined);
Type type = parseTypeName(local, table, true);
if (!type.isValid)
return parseUnaryExpr(tokens, table, defined);
if (!accept!(")")(local, TokenKind.punctuation))
return parseUnaryExpr(tokens, table, defined);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| local = tokens; | ||
| if (accept(local, spelling, TokenKind.identifier) || | ||
| accept(local, spelling, TokenKind.keyword)) | ||
| { | ||
| tokens = local; | ||
| return Expression(Identifier(spelling)); | ||
|
|
||
| auto local = tokens; | ||
| } |
There was a problem hiding this comment.
parsePrimaryExpr now accepts TokenKind.keyword as an Identifier. This will cause macro bodies that expand to keywords (e.g., __declspec, static, const, struct, etc.) to start producing output instead of being dropped, and the resulting D expression is often invalid (e.g., enum STATIC = static; or enum CINDEX_LINKAGE = __declspec(dllexport);). If the goal is “convert as much as possible”, consider emitting these as commented-out/disabled nodes or gating keyword-as-identifier behind an option so generated bindings remain compilable and functional snapshot tests don’t regress.
| } else if (["+", "-"].canFind(unaryExpr.operator)) | ||
| { | ||
| auto newExpr = new AddExpr(); | ||
| newExpr.operator = unaryExpr.operator; | ||
| newExpr.left = Identifier(castExpr.type.spelling); | ||
| newExpr.right = unaryExpr.subexpr; | ||
| expr = newExpr; | ||
| } else if (["*", "/", "%"].canFind(unaryExpr.operator)) | ||
| { | ||
| auto newExpr = new MulExpr(); | ||
| newExpr.operator = unaryExpr.operator; | ||
| newExpr.left = Identifier(castExpr.type.spelling); | ||
| newExpr.right = unaryExpr.subexpr; | ||
| expr = newExpr; |
There was a problem hiding this comment.
In fixCasts, the MulExpr branch checks for "/" and "%", but parseUnaryExpr never produces unary operators for / or %, so these cases are unreachable. Consider narrowing this to just "*" (dereference) to reduce confusion, or expanding the logic to match the actual operators that can appear here.
| @@ -1411,6 +1418,9 @@ Type parseTypeName(ref Token[] tokens, Cursor[string] table) | |||
| if (!parseSpecifierQualifierList(local, type, table)) | |||
| return type; | |||
|
|
|||
| if (anyType && acceptIdentifier(local, type.spelling)) | |||
| type.kind = CXTypeKind.unexposed; | |||
|
|
|||
| parseAbstractDeclarator(local, type, table); | |||
|
|
|||
| tokens = local; | |||
There was a problem hiding this comment.
This change introduces new behavior (treating unknown identifiers in cast/sizeof type-names as unexposed custom types, plus fixCasts). There doesn’t appear to be unit test coverage for the motivating case (e.g. #define FIELD_PTR(Record, TYPE, Field) ((TYPE *)&((Record)->Field))) or for ambiguous (SYMBOL) cases that fixCasts is meant to correct. Adding focused MacroTranslTests for these scenarios would help prevent regressions (especially around distinguishing casts vs calls).
Consider C code like:
Currently
dstepis not able to parse such macro definition because cast'sTYPEis not known at definition time but will be only known at usage time. So such definitions are just ignored.This PR implements support for custom types that doesn't need to be known at definition time.
Note that it's not trivial to determine where
(SYMBOL)is cast with a type or simply a parameter/value so after parsing we usefixCaststo try to fix these places with correct usage.We also repurpose
CXTypeKind.unexposedfor these custom type definitions.With this PR result for such macro definition will be:
Note that this change has a side effect that definitions like:
Will be converted to:
While previously they were simply dropped.
But I consider this to be wanted change to convert as much code as possible and I will submit another PR that will comment out invalid code.