Chapter 19: Match Compilation (Match Compilation)
소개
Chapter 17에서는 패턴 매칭의 이론적 기반을 다뤘다:
- Decision tree 알고리즘 (Maranget 2008)
- Pattern matrix와 specialization/defaulting 연산
- Exhaustiveness checking과 unreachable case detection
Chapter 18에서는 패턴 매칭이 작동할 데이터 구조를 구현했다:
!funlang.list<T>parameterized typefunlang.nil과funlang.consoperations- TypeConverter로 tagged union 변환
- NilOpLowering과 ConsOpLowering patterns
Chapter 19에서는 모든 것을 종합하여 패턴 매칭 컴파일을 완성한다. funlang.match operation을 정의하고 SCF dialect로 lowering하여 실행 가능한 코드를 생성한다.
두 장의 복습: 왜 Match Operation이 필요한가?
Chapter 17에서 우리는 decision tree 알고리즘을 배웠다:
// F# 패턴 매칭 예제
let rec sum_list lst =
match lst with
| [] -> 0
| head :: tail -> head + sum_list tail
sum_list [1; 2; 3] // 6
Decision tree 컴파일 결과:
Switch on lst:
Case Nil -> return 0
Case Cons(head, tail) -> return head + sum_list tail
Chapter 18에서 우리는 리스트 데이터 구조를 구현했다:
// Empty list
%empty = funlang.nil : !funlang.list<i32>
// List construction: [1, 2, 3]
%three = arith.constant 3 : i32
%t3 = funlang.nil : !funlang.list<i32>
%l3 = funlang.cons %three, %t3 : (i32, !funlang.list<i32>) -> !funlang.list<i32>
%two = arith.constant 2 : i32
%l2 = funlang.cons %two, %l3 : (i32, !funlang.list<i32>) -> !funlang.list<i32>
%one = arith.constant 1 : i32
%l1 = funlang.cons %one, %l2 : (i32, !funlang.list<i32>) -> !funlang.list<i32>
이제 이 두 가지를 연결할 방법이 필요하다:
// 목표: sum_list를 MLIR로 표현
func.func @sum_list(%lst: !funlang.list<i32>) -> i32 {
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
}
return %result : i32
}
funlang.match: The Most Complex Operation
왜 funlang.match가 가장 복잡한가?
지금까지 우리가 구현한 FunLang operations:
| Operation | Complexity | Why |
|---|---|---|
funlang.nil | Simple | Zero arguments, constant value |
funlang.cons | Moderate | Two operands, GC allocation |
funlang.closure | Moderate | Function ref + captures, GC allocation |
funlang.apply | Moderate | Indirect call, block arguments |
funlang.match | Complex | Multiple regions, block arguments, type conversion |
funlang.match의 복잡성:
- Region-based structure: 각 case가 별도의 region (not just basic block)
- Variable number of cases: Nil/Cons 2개부터 임의의 pattern 개수까지
- Block arguments per case: Cons case는
(%head, %tail)같은 바인딩 필요 - Type conversion in regions: 각 region 내부의 operations도 lowering 필요
- Multi-stage lowering: FunLang → SCF → CF → LLVM
Chapter 15 Preview 복습:
Chapter 15에서 우리는 funlang.match를 미리 살짝 봤다:
// Chapter 15의 preview (간략 버전)
def FunLang_MatchOp : FunLang_Op<"match"> {
let summary = "Pattern matching operation";
let arguments = (ins AnyType:$input);
let results = (outs AnyType:$result);
let regions = (region VariadicRegion<SizedRegion<1>>:$cases);
}
Chapter 19에서는 완전한 버전을 구현한다:
- Full TableGen definition with verification
- Custom assembly format (parser/printer)
- C API shim for region-based operation
- F# bindings with builder callback
- Lowering pattern to SCF dialect
Multi-Stage Lowering: FunLang → SCF → LLVM
왜 SCF dialect를 거치는가?
Phase 5에서 우리는 FunLang operations를 직접 LLVM dialect로 lowering했다:
funlang.closure → llvm.alloca + llvm.store (direct lowering)
funlang.apply → llvm.load + llvm.call (direct lowering)
하지만 funlang.match는 다르다:
funlang.match → scf.index_switch → cf.switch → llvm.switch
(structured) (CFG) (machine)
이유:
- Structured control flow preservation: SCF는 high-level structure 유지
- Optimization opportunities: SCF level에서 최적화 가능 (dead case elimination, etc.)
- Debugging: SCF IR이 source 구조를 반영하여 디버깅 쉬움
- Separation of concerns: Pattern matching logic과 low-level branching 분리
SCF Dialect란?
SCF = Structured Control Flow
MLIR의 standard dialect 중 하나로, high-level control flow operations 제공:
// scf.if: two-way branching (Chapter 8에서 사용)
%result = scf.if %cond : i1 -> i32 {
%x = arith.constant 42 : i32
scf.yield %x : i32
} else {
%y = arith.constant 0 : i32
scf.yield %y : i32
}
// scf.index_switch: multi-way branching (Chapter 19에서 사용)
%result = scf.index_switch %tag : index -> i32
case 0 {
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 {
%one = arith.constant 1 : i32
scf.yield %one : i32
}
default {
%minus = arith.constant -1 : i32
scf.yield %minus : i32
}
SCF vs CF (Control Flow) dialect:
| Dialect | Level | Structure | When |
|---|---|---|---|
| SCF | High-level | Structured (nested regions) | Pattern matching, loops |
| CF | Low-level | Unstructured (goto-like) | After SCF lowering |
Complete lowering pipeline:
FunLang Dialect
↓ (FunLangToSCFPass)
SCF + FunLang (partially lowered)
↓ (FunLangToLLVMPass - for nil/cons/closure/apply)
SCF + LLVM
↓ (SCFToControlFlowPass)
CF + LLVM
↓ (ControlFlowToLLVMPass)
LLVM Dialect only
↓ (LLVMToObjectPass)
Machine code
Chapter 19 Goals
이 장에서 배울 것:
-
Match Operation Definition (Part 1)
- Region-based operation structure
- TableGen definition with VariadicRegion
- Custom verifier for region semantics
- YieldOp terminator for match results
- C API shim for region-based operations
- F# bindings with builder callback pattern
-
SCF Lowering (Part 2)
- SCF dialect overview and
scf.index_switch - MatchOpLowering pattern implementation
- Region cloning and type conversion
- Block argument remapping
- Common errors and debugging strategies
- SCF dialect overview and
-
End-to-End Example
- length function: complete compilation pipeline
- Stage-by-stage IR transformation
- Performance comparison vs naive approach
Success criteria:
- ✅
funlang.matchoperation defined and verified - ✅ Lowering to
scf.index_switchworking - ✅ Pattern variables bound via block arguments
- ✅ End-to-end compilation of recursive list functions
Let’s begin!
Part 1: Match Operation Definition
Region-Based Operations: The Foundation
Region이란 무엇인가?
MLIR에서 region은 basic blocks의 container다.
Region
├─ Block 1 (entry block)
│ ├─ Operation 1
│ ├─ Operation 2
│ └─ Terminator
├─ Block 2
│ └─ ...
└─ Block N
우리가 이미 본 region-based operations:
Chapter 8에서 scf.if:
scf.if %cond : i1 -> i32 {
// "then" region (1 block)
%x = arith.constant 42 : i32
scf.yield %x : i32
} else {
// "else" region (1 block)
%y = arith.constant 0 : i32
scf.yield %y : i32
}
scf.if는 2개의 regions (then, else)- 각 region은 exactly 1 block
- 각 block은
scf.yieldterminator로 끝남
Chapter 10에서 func.func:
func.func @my_function(%arg: i32) -> i32 {
// function body region (1 or more blocks)
%result = arith.addi %arg, %arg : i32
return %result : i32
}
func.func는 1개의 region (body)- Region은 1개 이상의 blocks (control flow로 여러 block 가능)
- Entry block은 function arguments as block arguments
왜 basic blocks이 아니라 regions인가?
Scenario: match expression with 3 cases
// F# code
match shape with
| Circle r -> compute_circle_area r
| Rectangle (w, h) -> compute_rectangle_area w h
| Triangle (a, b, c) -> compute_triangle_area a b c
Option 1: Basic blocks (NOT what we do)
// 잘못된 접근: basic blocks only
func.func @match_shape(%shape: !funlang.shape) -> f32 {
// ... tag extraction ...
cf.br ^dispatch
^dispatch:
cf.switch %tag [
^circle,
^rectangle,
^triangle
]
^circle:
// Circle case logic
cf.br ^exit
^rectangle:
// Rectangle case logic
cf.br ^exit
^triangle:
// Triangle case logic
cf.br ^exit
^exit(%result: f32):
return %result : f32
}
문제점:
- All blocks in same scope: ^circle, ^rectangle, ^triangle은 모두 같은 function body region
- No encapsulation: Case logic이 function CFG에 섞임
- Hard to verify: “각 case가 정확히 1개의 yield를 가지는가?” 검증 어려움
- Type conversion complexity: Lowering pass가 case blocks을 구분하기 어려움
Option 2: Regions (What we do)
// 올바른 접근: regions
func.func @match_shape(%shape: !funlang.shape) -> f32 {
%result = funlang.match %shape : !funlang.shape -> f32 {
^circle(%r: f32):
%area = call @compute_circle_area(%r) : (f32) -> f32
funlang.yield %area : f32
^rectangle(%w: f32, %h: f32):
%area = call @compute_rectangle_area(%w, %h) : (f32, f32) -> f32
funlang.yield %area : f32
^triangle(%a: f32, %b: f32, %c: f32):
%area = call @compute_triangle_area(%a, %b, %c) : (f32, f32, f32) -> f32
funlang.yield %area : f32
}
return %result : f32
}
장점:
- Encapsulation: 각 case가 자신만의 region (isolated scope)
- Clear structure: match operation이 모든 cases를 소유
- Easy verification: 각 region은 정확히 1 block, 1 terminator
- Lowering-friendly: Region 단위로 type conversion 수행 가능
Region vs Block vs Operation:
Operation: funlang.match
↓ has
Regions: [case 1 region, case 2 region, ...]
↓ each contains
Blocks: [entry block]
↓ contains
Operations: [arith.constant, func.call, funlang.yield, ...]
Match Operation Semantics
funlang.match의 의미론:
%result = funlang.match %input : InputType -> ResultType {
^case1(...pattern_vars1...):
// case 1 logic
funlang.yield %value1 : ResultType
^case2(...pattern_vars2...):
// case 2 logic
funlang.yield %value2 : ResultType
...
}
Execution semantics:
- Input evaluation:
%input값을 runtime에 evaluate - Tag extraction: Tagged union에서 tag value 추출
- Case selection: Tag에 따라 해당 region 선택
- Pattern variable binding: Region의 block arguments에 values 바인딩
- Case execution: 선택된 region 실행
- Result yielding: Region의
funlang.yield가%result에 값 전달
Example: sum_list
func.func @sum_list(%lst: !funlang.list<i32>) -> i32 {
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
}
return %result : i32
}
Runtime execution: sum_list([1, 2])
- Call:
@sum_list([1, 2]) - Tag extraction: Tag = 1 (Cons)
- Case selection: ^cons region
- Variable binding:
%head = 1,%tail = [2] - Recursive call:
@sum_list([2])- Tag = 1 (Cons)
%head = 2,%tail = []- Recursive call:
@sum_list([])- Tag = 0 (Nil)
- Return 0
%sum = 2 + 0 = 2- Return 2
- Final sum:
1 + 2 = 3 - Return: 3
Block Arguments for Pattern Variables
패턴 변수는 어떻게 바인딩되는가?
Chapter 2에서 우리는 block arguments를 배웠다:
^entry_block(%arg0: i32, %arg1: i32):
%sum = arith.addi %arg0, %arg1 : i32
Block arguments는 PHI nodes의 structured 대안이다.
Match operation에서 block arguments 활용:
funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
// Nil case: 패턴 변수 없음 → block arguments 없음
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
// Cons case: 2개 패턴 변수 → 2개 block arguments
// %head: i32 → cons cell의 head field
// %tail: !funlang.list<i32> → cons cell의 tail field
funlang.yield %sum : i32
}
Lowering이 block arguments를 채우는 방법:
// funlang.match lowering 후 (pseudo-code)
%tag = // extract tag from %lst
scf.index_switch %tag {
case 0 { // Nil case
// No data to extract, no arguments
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 { // Cons case
// Extract head and tail from cons cell
%head = // extract field 0 from data pointer
%tail = // extract field 1 from data pointer
// Now pass to the ^cons block's body (with arguments bound)
^cons(%head, %tail):
// User code here
}
}
실제로는 region을 clone하고 IRMapping으로 arguments를 remap한다 (Part 2에서 자세히)
Block arguments vs Let bindings:
// Option 1: Block arguments (what we do)
^cons(%head: i32, %tail: !funlang.list<i32>):
%sum = arith.addi %head, ... : i32
// Option 2: Let-style extraction (what we DON'T do)
^cons:
%head = funlang.extract_head %lst : !funlang.list<i32> -> i32
%tail = funlang.extract_tail %lst : !funlang.list<i32> -> !funlang.list<i32>
%sum = arith.addi %head, ... : i32
Block arguments가 더 나은 이유:
- Declarative: Pattern structure가 arguments에 직접 반영
- SSA-friendly: Block entry에서 values가 이미 available
- No redundant ops: extract operations 불필요
- Verification: Argument types로 pattern structure 검증 가능
TableGen Definition: MatchOp
이제 funlang.match operation의 TableGen 정의를 작성한다.
File: FunLang/FunLangOps.td (conceptual, 실제로는 C++ codebase)
def FunLang_MatchOp : FunLang_Op<"match", [
RecursiveSideEffect,
SingleBlockImplicitTerminator<"YieldOp">
]> {
let summary = "Pattern matching operation";
let description = [{
The `funlang.match` operation performs pattern matching on a value.
Each case is represented as a separate region with exactly one block.
The entry block of each region may have arguments corresponding to
pattern variables. For example, a Cons case has two arguments:
the head element and the tail list.
Each region must terminate with a `funlang.yield` operation that
returns a value of the result type.
Example:
```mlir
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%result = arith.addi %head, %sum : i32
funlang.yield %result : i32
}
```
}];
let arguments = (ins AnyType:$input);
let results = (outs AnyType:$result);
let regions = (region VariadicRegion<SizedRegion<1>>:$cases);
let hasCustomAssemblyFormat = 1;
let hasVerifier = 1;
}
핵심 요소 설명:
1. Traits: RecursiveSideEffect
RecursiveSideEffect
의미: 이 operation의 side effects는 내부 regions의 operations에 의존한다.
왜 필요한가?
MLIR optimizer는 side effects를 분석하여 dead code elimination, common subexpression elimination 등을 수행한다.
funlang.nil은Puretrait → no side effectsfunlang.cons는 side effects 있음 (GC allocation)
Match operation은?
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%x = funlang.nil : !funlang.list<i32> // Pure
funlang.yield %zero : i32
^cons(%h, %t):
%y = funlang.cons %h, %t : ... // Side effect!
funlang.yield %sum : i32
}
- Nil case: no side effects
- Cons case: side effect (funlang.cons)
RecursiveSideEffect trait는 MLIR에게 말한다:
“이 operation의 side effects는 내부 regions을 재귀적으로 분석해서 결정해라”
없으면 어떻게 되나?
- Conservative assumption: match는 항상 side effects 있음
- Optimizer가 legitimate optimizations를 못함
- 예: dead match elimination 불가
2. Traits: SingleBlockImplicitTerminator
SingleBlockImplicitTerminator<"YieldOp">
의미: 각 region은 정확히 1개의 block을 가지며, 그 block은 YieldOp로 끝나야 한다.
검증 자동화:
이 trait가 있으면 MLIR이 자동으로 검증:
- 각 region이 정확히 1 block인가?
- 그 block이
funlang.yield로 끝나는가?
없으면 어떻게 되나?
Custom verifier에서 수동 검증 필요:
// Without the trait (manual verification)
LogicalResult MatchOp::verify() {
for (Region& region : getCases()) {
if (!region.hasOneBlock()) {
return emitError("each case must have exactly one block");
}
Block& block = region.front();
if (!isa<YieldOp>(block.getTerminator())) {
return emitError("each case must terminate with funlang.yield");
}
}
return success();
}
Trait가 이 boilerplate를 제거한다!
3. Regions: VariadicRegion<SizedRegion<1>>
let regions = (region VariadicRegion<SizedRegion<1>>:$cases);
분해:
VariadicRegion: 가변 개수의 regions (Nil/Cons = 2개, 더 많은 patterns = N개)SizedRegion<1>: 각 region은 정확히 1개의 block:$cases: C++ accessor name →getCases()method
대안들과 비교:
| Declaration | Meaning |
|---|---|
region AnyRegion:$body | Exactly 1 region, any number of blocks |
region SizedRegion<1>:$body | Exactly 1 region, exactly 1 block |
region VariadicRegion<AnyRegion>:$cases | N regions, each with any blocks |
region VariadicRegion<SizedRegion<1>>:$cases | N regions, each with 1 block ✅ |
scf.if와 비교:
// scf.if (exactly 2 regions)
def SCF_IfOp : ... {
let regions = (region SizedRegion<1>:$thenRegion,
SizedRegion<1>:$elseRegion);
}
// funlang.match (variable number of regions)
def FunLang_MatchOp : ... {
let regions = (region VariadicRegion<SizedRegion<1>>:$cases);
}
4. Custom Assembly Format
let hasCustomAssemblyFormat = 1;
이유: Generic format은 readable하지 않다.
Generic format (자동 생성):
%result = "funlang.match"(%lst) ({
^bb0:
%zero = arith.constant 0 : i32
"funlang.yield"(%zero) : (i32) -> ()
}, {
^bb0(%head: i32, %tail: !funlang.list<i32>):
%sum = arith.addi %head, %tail_sum : i32
"funlang.yield"(%sum) : (i32) -> ()
}) : (!funlang.list<i32>) -> i32
Custom format (우리가 작성):
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
}
Custom parser/printer 구현 필요 (C++ code):
// File: FunLangOps.cpp
void MatchOp::print(OpAsmPrinter& p) {
p << " " << getInput() << " : " << getInput().getType()
<< " -> " << getResult().getType() << " ";
p.printRegion(getCases(), /*printEntryBlockArgs=*/true);
}
ParseResult MatchOp::parse(OpAsmParser& parser, OperationState& result) {
OpAsmParser::UnresolvedOperand input;
Type inputType, resultType;
Region* casesRegion = result.addRegion();
if (parser.parseOperand(input) ||
parser.parseColon() ||
parser.parseType(inputType) ||
parser.parseArrow() ||
parser.parseType(resultType) ||
parser.parseRegion(*casesRegion, /*arguments=*/{}, /*argTypes=*/{}))
return failure();
result.addTypes(resultType);
return success();
}
실제 구현은 더 복잡하지만, F# tutorial에서는 C API로 추상화됨
5. Custom Verifier
let hasVerifier = 1;
검증할 내용:
- ✅ Region count > 0
- ✅ 각 region의 block arguments types 검증
- ✅ 각 region의 yield type이 result type과 일치
- ✅ Input type이 matchable type (현재는 !funlang.list
)
C++ verifier implementation:
// File: FunLangOps.cpp
LogicalResult MatchOp::verify() {
// Check: at least one case
if (getCases().empty()) {
return emitError("match must have at least one case");
}
Type resultType = getResult().getType();
// Check each case region
for (Region& region : getCases()) {
if (region.empty())
return emitError("case region cannot be empty");
Block& block = region.front();
// Verify terminator (already checked by SingleBlockImplicitTerminator)
auto yieldOp = dyn_cast<YieldOp>(block.getTerminator());
if (!yieldOp)
return emitError("case must terminate with funlang.yield");
// Verify yield type matches result type
if (yieldOp.getValue().getType() != resultType) {
return emitError("yield type ")
<< yieldOp.getValue().getType()
<< " does not match result type " << resultType;
}
}
return success();
}
실전 예제:
// ERROR: No cases
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
}
// Error: match must have at least one case
// ERROR: Type mismatch
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%x = arith.constant 3.14 : f32 // Wrong type!
funlang.yield %x : f32
}
// Error: yield type f32 does not match result type i32
YieldOp: Match Result Terminator
각 match case는 funlang.yield로 끝나야 한다.
TableGen definition:
def FunLang_YieldOp : FunLang_Op<"yield", [
Terminator,
HasParent<"MatchOp">
]> {
let summary = "Yield a value from a match case";
let description = [{
The `funlang.yield` operation terminates a match case region and
returns a value to the parent `funlang.match` operation.
Example:
```mlir
funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32 // Yield from nil case
^cons(%h, %t):
%sum = arith.addi %h, ... : i32
funlang.yield %sum : i32 // Yield from cons case
}
```
}];
let arguments = (ins AnyType:$value);
let results = (outs);
let assemblyFormat = "$value attr-dict `:` type($value)";
}
핵심 요소:
1. Trait: Terminator
Terminator
의미: 이 operation은 basic block을 종료한다.
Block의 terminator 규칙:
- 모든 block은 정확히 1개의 terminator로 끝나야 함
- Terminator는 block의 마지막 operation이어야 함
- Terminator 예:
func.return,cf.br,scf.yield,funlang.yield
2. Trait: HasParent<“MatchOp”>
HasParent<"MatchOp">
의미: 이 operation은 MatchOp의 region 내에서만 사용 가능.
검증 자동화:
// OK: inside funlang.match
funlang.match %lst {
^nil:
funlang.yield %zero : i32 // ✅
}
// ERROR: outside match
func.func @wrong() -> i32 {
%x = arith.constant 42 : i32
funlang.yield %x : i32 // ❌ Error: funlang.yield must be inside MatchOp
}
3. Assembly Format
let assemblyFormat = "$value attr-dict `:` type($value)";
생성되는 format:
funlang.yield %sum : i32
Generic format과 비교:
// Generic (verbose)
"funlang.yield"(%sum) : (i32) -> ()
// Custom (readable)
funlang.yield %sum : i32
scf.yield와 비교
MLIR에는 여러 yield operations이 있다:
| Operation | Parent | Purpose |
|---|---|---|
scf.yield | scf.if, scf.for, scf.while | SCF control flow |
funlang.yield | funlang.match | FunLang pattern matching |
affine.yield | affine.for, affine.if | Affine loops |
왜 scf.yield를 재사용하지 않는가?
Option 1: 재사용 (하지 않음)
funlang.match %lst {
^nil:
scf.yield %zero : i32 // Reuse scf.yield?
}
문제:
-
Trait conflict:
scf.yield는HasParent<"IfOp", "ForOp", ...>MatchOp이 parent list에 없으면 verifier 실패- SCF dialect 수정 필요 (bad coupling)
-
Semantic confusion:
scf.yield는 SCF dialect semantics- Lowering pass에서
scf.yield처리 시 match context 고려해야 함 - Separation of concerns 위반
- Lowering pass에서
Option 2: 전용 operation (우리가 하는 것)
funlang.match %lst {
^nil:
funlang.yield %zero : i32 // FunLang-specific yield
}
장점:
- Clear ownership: FunLang dialect이 자신의 terminators 소유
- Lowering flexibility:
funlang.yield→scf.yield변환을 명시적으로 제어 - Future extensions: 나중에
funlang.yield에 attributes 추가 가능
C API and F# Integration
Region-based operations는 C API 설계가 복잡하다.
문제: Regions를 어떻게 F#에서 구축하는가?
Simple operations (Chapter 15):
// funlang.cons: no regions, straightforward
let cons =
FunLangOps.CreateConsOp(builder, head, tail, listType)
Region-based operations (Chapter 19):
// funlang.match: multiple regions, complex
let matchOp = FunLangOps.CreateMatchOp(builder, input, resultType, [
// How to build regions here???
nilRegion;
consRegion
])
Challenge:
- Region 구축은 F# side에서 일어나야 함 (pattern cases logic)
- 하지만 MLIR C++ API를 직접 호출할 수 없음 (C API만 가능)
- Builder callback pattern 필요!
C API Shim: Builder Callback Pattern
Pattern: C API가 F# callback을 받아서 region을 채움
C API shim (C wrapper):
// File: FunLang-C/FunLangOps.h
typedef void (*FunLangMatchCaseBuilder)(
MlirOpBuilder builder,
MlirBlock block,
void* userData
);
typedef struct {
FunLangMatchCaseBuilder builder;
void* userData;
} FunLangMatchCase;
MLIR_CAPI_EXPORTED MlirOperation funlangMatchOpCreate(
MlirOpBuilder builder,
MlirLocation loc,
MlirValue input,
MlirType resultType,
FunLangMatchCase* cases,
intptr_t numCases
);
Implementation (C++):
// File: FunLang-C/FunLangOps.cpp
MlirOperation funlangMatchOpCreate(
MlirOpBuilder builder,
MlirLocation loc,
MlirValue input,
MlirType resultType,
FunLangMatchCase* cases,
intptr_t numCases
) {
OpBuilder& cppBuilder = unwrap(builder);
Location cppLoc = unwrap(loc);
Value cppInput = unwrap(input);
Type cppResultType = unwrap(resultType);
// Create match operation
auto matchOp = cppBuilder.create<MatchOp>(
cppLoc, cppResultType, cppInput, numCases);
// Build each case region
for (intptr_t i = 0; i < numCases; ++i) {
Region& region = matchOp.getCases()[i];
Block* block = cppBuilder.createBlock(®ion);
// Invoke F# callback to populate the block
MlirBlock wrappedBlock = wrap(block);
cases[i].builder(builder, wrappedBlock, cases[i].userData);
}
return wrap(matchOp.getOperation());
}
핵심 아이디어:
- C API가 empty regions를 가진
MatchOp생성 - 각 region에 대해 F# callback 호출
- F# callback이 region의 block을 채움 (operations + yield)
F# Bindings
Low-level binding:
// File: FunLang.Interop/FunLangOps.fs
type MatchCaseBuilder =
MlirOpBuilder -> MlirBlock -> nativeint -> unit
[<Struct>]
type MatchCase =
val Builder: MatchCaseBuilder
val UserData: nativeint
new(builder, userData) =
{ Builder = builder; UserData = userData }
[<DllImport("FunLang-C", CallingConvention = CallingConvention.Cdecl)>]
extern MlirOperation funlangMatchOpCreate(
MlirOpBuilder builder,
MlirLocation loc,
MlirValue input,
MlirType resultType,
MatchCase[] cases,
nativeint numCases
)
High-level wrapper:
// File: FunLang.Compiler/OpBuilder.fs
type OpBuilder with
member this.CreateMatchOp(
input: MlirValue,
resultType: MlirType,
buildCases: (OpBuilder -> Block -> unit) list
) : MlirOperation =
// Convert F# functions to C callbacks
let cases =
buildCases
|> List.map (fun buildCase ->
let callback builder block userData =
let opBuilder = new OpBuilder(builder)
let mlirBlock = new Block(block)
buildCase opBuilder mlirBlock
MatchCase(callback, 0n)
)
|> List.toArray
let numCases = nativeint cases.Length
let loc = this.UnknownLoc()
funlangMatchOpCreate(
this.Handle,
loc,
input,
resultType,
cases,
numCases
)
사용 예제 (F# compiler code):
// File: FunLang.Compiler/Codegen.fs
let compileMatch (builder: OpBuilder) (scrutinee: MlirValue) (cases: MatchCase list) =
let resultType = // infer from cases
let buildCases =
cases |> List.map (fun case ->
fun (builder: OpBuilder) (block: Block) ->
match case with
| NilCase expr ->
// Build nil case body
let value = compileExpr builder env expr
builder.CreateYieldOp(value) |> ignore
| ConsCase (headVar, tailVar, expr) ->
// Add block arguments for head and tail
let headType = builder.GetIntegerType(32)
let tailType = builder.GetFunLangListType(headType)
block.AddArgument(headType) |> ignore
block.AddArgument(tailType) |> ignore
// Build cons case body with extended environment
let env' =
env
|> Map.add headVar (block.GetArgument(0))
|> Map.add tailVar (block.GetArgument(1))
let value = compileExpr builder env' expr
builder.CreateYieldOp(value) |> ignore
)
builder.CreateMatchOp(scrutinee, resultType, buildCases)
Generated MLIR:
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
}
Builder callback pattern의 장점:
- Flexibility: F# code가 region 내용을 완전히 제어
- Type safety: F# compiler가 callback signature 검증
- Composability: Nested match expressions 지원 (callback 안에서 또 match 생성)
Block Arguments in Builder Callback
위 코드에서 중요한 부분:
| ConsCase (headVar, tailVar, expr) ->
// Add block arguments for pattern variables
block.AddArgument(headType) |> ignore
block.AddArgument(tailType) |> ignore
// Use block arguments in environment
let env' =
env
|> Map.add headVar (block.GetArgument(0))
|> Map.add tailVar (block.GetArgument(1))
F# callback이 하는 일:
- Pattern structure 분석: ConsCase는 2개 변수 (head, tail)
- Block arguments 추가: Cons case block에 2개 arguments
- Environment extension: Pattern variables를 block arguments로 바인딩
- Body compilation: Extended environment로 case expression 컴파일
Lowering pass의 책임:
Lowering pass는 이 block arguments를 실제 데이터로 채운다:
// MatchOpLowering (Part 2에서 자세히)
// 1. Extract head and tail from cons cell
Value head = extractHead(builder, consCellPtr);
Value tail = extractTail(builder, consCellPtr);
// 2. Clone cons region
IRMapping mapper;
mapper.map(consBlock->getArgument(0), head); // Map %head
mapper.map(consBlock->getArgument(1), tail); // Map %tail
// 3. Clone operations with mapped values
for (Operation& op : consBlock->getOperations()) {
builder.clone(op, mapper);
}
결과: Block arguments가 실제 values로 대체됨
중간 정리: Part 1 완료
Part 1에서 다룬 내용:
✅ Region-based operation structure
- Regions vs basic blocks
- Encapsulation과 verification 장점
✅ Match operation semantics
- Runtime execution model
- Tag extraction → case selection → variable binding → yield
✅ TableGen definition
- Traits: RecursiveSideEffect, SingleBlockImplicitTerminator
- VariadicRegion<SizedRegion<1>> for variable cases
- Custom assembly format과 verifier
✅ YieldOp terminator
- Terminator trait
- HasParent<“MatchOp”> constraint
- Comparison with scf.yield
✅ C API and F# integration
- Builder callback pattern
- High-level wrapper for match construction
- Block arguments for pattern variables
다음 Part 2에서:
- SCF dialect 상세 설명
- MatchOpLowering pattern 완전 구현
- Region cloning과 IRMapping
- 전체 pipeline 예제 (sum_list)
- Common errors와 debugging
Part 2: SCF Lowering and Pipeline
SCF Dialect: Structured Control Flow
SCF = Structured Control Flow
Chapter 8에서 우리는 scf.if를 사용했다:
%result = scf.if %cond : i1 -> i32 {
%then_val = arith.constant 42 : i32
scf.yield %then_val : i32
} else {
%else_val = arith.constant 0 : i32
scf.yield %else_val : i32
}
SCF dialect의 핵심 operations:
| Operation | Purpose | Regions |
|---|---|---|
scf.if | Two-way branch | 2 (then, else) |
scf.index_switch | Multi-way branch | N (cases) + default |
scf.for | Counted loop | 1 (body) |
scf.while | Conditional loop | 2 (before, after) |
Chapter 19에서는 scf.index_switch를 사용한다.
scf.index_switch: Multi-Way Branching
Syntax:
%result = scf.index_switch %selector : index -> ResultType
case 0 {
// Case 0 operations
scf.yield %value0 : ResultType
}
case 1 {
// Case 1 operations
scf.yield %value1 : ResultType
}
default {
// Default case (optional)
scf.yield %default_val : ResultType
}
Semantics:
- Selector evaluation:
%selector값을 runtime에 evaluate (index type) - Case selection: Selector 값에 해당하는 case region 선택
- Fallback: 해당하는 case가 없으면 default region (있다면)
- Result yielding: 선택된 region의
scf.yield가 결과 전달
Example: Tag dispatch for list
// %lst: !funlang.list<i32>
// Tag extraction
%struct = // convert %lst to !llvm.struct<(i32, ptr)>
%tag = llvm.extractvalue %struct[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %tag : i32 to index
// Dispatch on tag
%result = scf.index_switch %tag_index : index -> i32
case 0 { // Nil case (tag = 0)
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 { // Cons case (tag = 1)
%ptr = llvm.extractvalue %struct[1] : !llvm.struct<(i32, ptr)>
%head = llvm.load %ptr : !llvm.ptr -> i32
// ... compute with head ...
scf.yield %sum : i32
}
default {
// Unreachable for {Nil, Cons} (complete constructor set)
%minus = arith.constant -1 : i32
scf.yield %minus : i32
}
Why SCF Before LLVM?
Option 1: Direct lowering funlang.match → LLVM (what we DON’T do)
// Directly to LLVM dialect
%tag = llvm.extractvalue ...
llvm.switch %tag [
0: ^nil_block,
1: ^cons_block
]
^nil_block:
// ... operations ...
llvm.br ^merge_block(%zero)
^cons_block:
// ... operations ...
llvm.br ^merge_block(%sum)
^merge_block(%result: i32):
llvm.return %result
문제점:
- Lost structure: CFG는 원래 match의 case structure를 상실
- Harder optimization: Which blocks belong to which case? 불명확
- Debugging: LLVM IR에서 source pattern matching 추적 어려움
- Lowering complexity: funlang.match → LLVM을 한 번에 구현해야 함
Option 2: Progressive lowering funlang.match → SCF → CF → LLVM (what we do)
// Stage 1: FunLang
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil: funlang.yield %zero : i32
^cons(%h, %t): funlang.yield %sum : i32
}
// Stage 2: SCF (structured, high-level)
%tag_index = // extract tag and cast to index
%result = scf.index_switch %tag_index : index -> i32
case 0 { scf.yield %zero : i32 }
case 1 { scf.yield %sum : i32 }
// Stage 3: CF (goto-style, low-level)
cf.switch %tag_index [
0: ^block_0,
1: ^block_1
]
^block_0: cf.br ^merge(%zero)
^block_1: cf.br ^merge(%sum)
^merge(%result: i32): ...
// Stage 4: LLVM (machine-level)
llvm.switch %tag_i8 [
0: ^llvm_0,
1: ^llvm_1
]
// ... LLVM blocks ...
장점:
- Separation of concerns: 각 lowering pass는 하나의 변환만 책임
- Optimization hooks: SCF level에서 pattern-specific optimizations
- Incremental verification: 각 stage마다 IR 검증 가능
- Easier debugging: 문제 발생 시 어느 stage에서 일어났는지 명확
Comparison: SCF vs CF
| Aspect | SCF | CF |
|---|---|---|
| Structure | Nested regions | Flat blocks |
| Control flow | Implicit (yield returns) | Explicit (br/switch) |
| Source mapping | Preserves match structure | Lost |
| Optimization | High-level (dead case elimination) | Low-level (block merging) |
| Readability | High (similar to source) | Low (machine-like) |
Example: Dead case elimination at SCF level
// Input: match on statically-known value
%nil = funlang.nil : !funlang.list<i32>
%result = funlang.match %nil : !funlang.list<i32> -> i32 {
^nil: funlang.yield %zero : i32
^cons(%h, %t): funlang.yield %sum : i32 // Dead!
}
// After lowering to SCF
%tag_index = arith.constant 0 : index // Statically known!
%result = scf.index_switch %tag_index : index -> i32
case 0 { scf.yield %zero : i32 }
case 1 { scf.yield %sum : i32 } // Dead case!
// SCF optimizer can eliminate case 1
%result = scf.index_switch %tag_index : index -> i32
case 0 { scf.yield %zero : i32 }
// case 1 removed
// Further optimization: constant folding
%result = %zero // Direct replacement!
이런 최적화는 CF level에서는 훨씬 어렵다.
MatchOp Lowering Strategy
Goal: funlang.match → scf.index_switch 변환
Input (FunLang):
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
}
Output (SCF + LLVM):
// 1. Convert list type to struct
%struct = builtin.unrealized_conversion_cast %lst
: !funlang.list<i32> to !llvm.struct<(i32, ptr)>
// 2. Extract tag
%tag_i32 = llvm.extractvalue %struct[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %tag_i32 : i32 to index
// 3. Extract data pointer (for cons case)
%data_ptr = llvm.extractvalue %struct[1] : !llvm.struct<(i32, ptr)>
// 4. Multi-way switch
%result = scf.index_switch %tag_index : index -> i32
case 0 {
// Nil case: no data to extract
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 {
// Cons case: extract head and tail
%head_ptr = %data_ptr // Points to [head, tail] array
%head = llvm.load %head_ptr : !llvm.ptr -> i32
%tail_ptr = llvm.getelementptr %data_ptr[1] : (!llvm.ptr) -> !llvm.ptr
%tail_struct_ptr = llvm.load %tail_ptr : !llvm.ptr -> !llvm.ptr
%tail_struct = llvm.load %tail_struct_ptr : !llvm.ptr -> !llvm.struct<(i32, ptr)>
%tail = builtin.unrealized_conversion_cast %tail_struct
: !llvm.struct<(i32, ptr)> to !funlang.list<i32>
// Cons case body (converted)
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%sum = arith.addi %head, %tail_sum : i32
scf.yield %sum : i32
}
default {
// Unreachable for {Nil, Cons}
%minus = arith.constant -1 : i32
scf.yield %minus : i32
}
Lowering steps:
- Type conversion:
!funlang.list<T>→!llvm.struct<(i32, ptr)> - Tag extraction:
llvm.extractvalueto get tag field - Index casting:
arith.index_castforscf.index_switchselector - Case region cloning: 각 funlang.match case를 scf.index_switch case로 복사
- Block argument mapping: Pattern variables를 extracted values로 대체
- Terminator conversion:
funlang.yield→scf.yield
Tag Value Mapping
Chapter 18 recap: List representation
// NilOpLowering
Value tag = builder.create<arith::ConstantIntOp>(loc, 0, builder.getI32Type());
// ConsOpLowering
Value tag = builder.create<arith::ConstantIntOp>(loc, 1, builder.getI32Type());
Tag mapping:
| Constructor | Tag Value |
|---|---|
| Nil | 0 |
| Cons | 1 |
MatchOpLowering은 이 mapping을 알아야 한다:
// In MatchOpLowering::matchAndRewrite
// Case 0 → Nil pattern
// Case 1 → Cons pattern
for (auto [index, region] : llvm::enumerate(matchOp.getCases())) {
// index = 0 → Nil
// index = 1 → Cons
builder.create<scf::IndexSwitchCaseOp>(loc, index);
// ... clone region ...
}
Future extension: 임의의 ADT
지금은 hardcoded mapping (Nil=0, Cons=1)이지만, 나중에는:
// Extensible ADT definition
def Shape : FunLang_ADT<"shape"> {
let constructors = [
Constructor<"circle", [F32]>, // tag = 0
Constructor<"rectangle", [F32, F32]>, // tag = 1
Constructor<"triangle", [F32, F32, F32]> // tag = 2
];
}
Compiler가 자동으로 tag 할당.
Pattern Variable Binding
Cons case의 challenge: block arguments를 어떻게 채우는가?
Source (FunLang):
^cons(%head: i32, %tail: !funlang.list<i32>):
// %head와 %tail이 어디서 오는가?
funlang.yield %sum : i32
Lowering 후 (SCF):
case 1 {
// 여기서 %head와 %tail을 extract해야 함
%head = llvm.load %data_ptr : !llvm.ptr -> i32
%tail = // ... complex extraction ...
// 이제 body를 clone하면서 block arguments를 이 values로 map
// (IRMapping 사용)
}
IRMapping: SSA Value Remapping
MLIR의 IRMapping class는 “old value → new value” mapping을 저장한다.
IRMapping mapper;
mapper.map(oldValue1, newValue1);
mapper.map(oldValue2, newValue2);
// Clone operation with mapped values
Operation* newOp = builder.clone(*oldOp, mapper);
// oldOp의 operands가 oldValue1, oldValue2였다면
// newOp의 operands는 newValue1, newValue2로 대체됨
MatchOpLowering에서 IRMapping 사용:
// Cons case region
Region& consRegion = matchOp.getCases()[1];
Block* consBlock = &consRegion.front();
// consBlock의 block arguments:
// consBlock->getArgument(0) = %head (i32)
// consBlock->getArgument(1) = %tail (!funlang.list<i32>)
// Extract actual values
Value actualHead = extractHead(builder, dataPtrConverted);
Value actualTail = extractTail(builder, dataPtrConverted, typeConverter);
// Map block arguments to extracted values
IRMapping mapper;
mapper.map(consBlock->getArgument(0), actualHead);
mapper.map(consBlock->getArgument(1), actualTail);
// Clone operations in consBlock with mapping
for (Operation& op : consBlock->getOperations()) {
if (isa<YieldOp>(op)) {
// Convert funlang.yield → scf.yield
builder.create<scf::YieldOp>(op.getLoc(),
mapper.lookupOrDefault(op.getOperand(0)));
} else {
// Clone other operations with mapped operands
builder.clone(op, mapper);
}
}
Result: Block arguments가 사라지고 extracted values로 대체됨
// Before (funlang.match case)
^cons(%head: i32, %tail: !funlang.list<i32>):
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
// After (scf.index_switch case)
case 1 {
%head = llvm.load ... // Extracted value
%tail = ... // Extracted value
%sum = arith.addi %head, %tail_sum : i32 // %head mapped
scf.yield %sum : i32
}
MatchOpLowering Pattern: Complete Implementation
이제 전체 lowering pattern을 구현한다.
File: FunLang/Transforms/FunLangToSCF.cpp (conceptual C++ code)
#include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h"
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Dialect/LLVM/LLVMDialect.h"
#include "mlir/Dialect/SCF/IR/SCF.h"
#include "mlir/IR/IRMapping.h"
#include "mlir/Transforms/DialectConversion.h"
#include "FunLang/IR/FunLangOps.h"
#include "FunLang/Transforms/TypeConverter.h"
using namespace mlir;
using namespace mlir::funlang;
namespace {
// Helper: Extract head from cons cell
// Input: %data_ptr points to [head, tail] array
// Output: %head value
Value extractHead(OpBuilder& builder, Location loc,
Value dataPtrConverted, Type headType) {
// %data_ptr already points to cons cell array
// Load first element (head)
Value head = builder.create<LLVM::LoadOp>(loc, headType, dataPtrConverted);
return head;
}
// Helper: Extract tail from cons cell
// Input: %data_ptr points to [head, tail] array
// Output: %tail value (converted back to !funlang.list<T>)
Value extractTail(OpBuilder& builder, Location loc,
Value dataPtrConverted,
FunLangTypeConverter* typeConverter,
Type tailFunLangType) {
// GEP to second element (tail)
Value one = builder.create<arith::ConstantIntOp>(loc, 1, builder.getI32Type());
Value tailPtr = builder.create<LLVM::GEPOp>(
loc, LLVM::LLVMPointerType::get(builder.getContext()),
dataPtrConverted, ValueRange{one});
// Load tail pointer
Value tailStructPtr = builder.create<LLVM::LoadOp>(
loc, LLVM::LLVMPointerType::get(builder.getContext()), tailPtr);
// Load tail struct
Type tailStructType = typeConverter->convertType(tailFunLangType);
Value tailStruct = builder.create<LLVM::LoadOp>(
loc, tailStructType, tailStructPtr);
// Convert back to FunLang type (for remaining funlang operations in body)
Value tail = builder.create<UnrealizedConversionCastOp>(
loc, tailFunLangType, tailStruct).getResult(0);
return tail;
}
// Helper: Convert funlang.yield to scf.yield in region
void convertYieldOps(Region& region, OpBuilder& builder, IRMapping& mapper) {
for (Block& block : region) {
for (Operation& op : llvm::make_early_inc_range(block)) {
if (auto yieldOp = dyn_cast<YieldOp>(&op)) {
builder.setInsertionPoint(yieldOp);
Value yieldValue = mapper.lookupOrDefault(yieldOp.getValue());
builder.create<scf::YieldOp>(yieldOp.getLoc(), yieldValue);
yieldOp.erase();
}
}
}
}
// Main lowering pattern
class MatchOpLowering : public OpConversionPattern<MatchOp> {
public:
using OpConversionPattern<MatchOp>::OpConversionPattern;
LogicalResult matchAndRewrite(
MatchOp matchOp,
OpAdaptor adaptor,
ConversionPatternRewriter& rewriter) const override {
Location loc = matchOp.getLoc();
Value input = adaptor.getInput();
Type resultType = matchOp.getResult().getType();
auto* typeConverter = getTypeConverter<FunLangTypeConverter>();
// 1. Convert input to LLVM struct
// input: !funlang.list<T> → !llvm.struct<(i32, ptr)>
Type structType = typeConverter->convertType(input.getType());
Value structVal = rewriter.create<UnrealizedConversionCastOp>(
loc, structType, input).getResult(0);
// 2. Extract tag field
Value tag = rewriter.create<LLVM::ExtractValueOp>(loc, structVal, 0);
// 3. Cast tag to index (for scf.index_switch)
Value tagIndex = rewriter.create<arith::IndexCastOp>(
loc, rewriter.getIndexType(), tag);
// 4. Extract data pointer (needed for cons case)
Value dataPtr = rewriter.create<LLVM::ExtractValueOp>(loc, structVal, 1);
// 5. Create scf.index_switch
auto indexSwitchOp = rewriter.create<scf::IndexSwitchOp>(
loc, resultType, tagIndex, matchOp.getCases().size());
// 6. Process each case region
for (auto [caseIndex, caseRegion] :
llvm::enumerate(matchOp.getCases())) {
Block* originalBlock = &caseRegion.front();
Region& switchCaseRegion = indexSwitchOp.getCaseRegions()[caseIndex];
Block* caseBlock = rewriter.createBlock(&switchCaseRegion);
rewriter.setInsertionPointToStart(caseBlock);
IRMapping mapper;
// Handle block arguments (pattern variables)
if (caseIndex == 1) { // Cons case
// originalBlock has 2 arguments: %head, %tail
// Extract head
Type headFunLangType = originalBlock->getArgument(0).getType();
Type headLLVMType = typeConverter->convertType(headFunLangType);
Value head = extractHead(rewriter, loc, dataPtr, headLLVMType);
// Convert head to FunLang type if needed
Value headFunLang = head;
if (headFunLangType != headLLVMType) {
headFunLang = rewriter.create<UnrealizedConversionCastOp>(
loc, headFunLangType, head).getResult(0);
}
// Extract tail
Type tailFunLangType = originalBlock->getArgument(1).getType();
Value tail = extractTail(rewriter, loc, dataPtr,
typeConverter, tailFunLangType);
// Map block arguments to extracted values
mapper.map(originalBlock->getArgument(0), headFunLang);
mapper.map(originalBlock->getArgument(1), tail);
}
// Nil case (caseIndex == 0): no block arguments, no extraction
// Clone operations from original region
for (Operation& op : originalBlock->getOperations()) {
if (auto yieldOp = dyn_cast<YieldOp>(&op)) {
// Convert funlang.yield → scf.yield
Value yieldValue = mapper.lookupOrDefault(yieldOp.getValue());
rewriter.create<scf::YieldOp>(loc, yieldValue);
} else {
// Clone operation with mapped operands
rewriter.clone(op, mapper);
}
}
}
// 7. Add default region (unreachable for complete constructor sets)
{
Region& defaultRegion = indexSwitchOp.getDefaultRegion();
Block* defaultBlock = rewriter.createBlock(&defaultRegion);
rewriter.setInsertionPointToStart(defaultBlock);
// Emit error value (this should never execute)
Value errorVal;
if (resultType.isIntOrIndex()) {
errorVal = rewriter.create<arith::ConstantIntOp>(loc, -1, resultType);
} else {
// For other types, emit unreachable or null
errorVal = rewriter.create<LLVM::ZeroOp>(loc, resultType);
}
rewriter.create<scf::YieldOp>(loc, errorVal);
}
// 8. Replace match operation with index_switch result
rewriter.replaceOp(matchOp, indexSwitchOp.getResult(0));
return success();
}
};
} // namespace
// Pass definition
struct FunLangToSCFPass
: public PassWrapper<FunLangToSCFPass, OperationPass<ModuleOp>> {
void getDependentDialects(DialectRegistry& registry) const override {
registry.insert<arith::ArithDialect,
scf::SCFDialect,
LLVM::LLVMDialect>();
}
void runOnOperation() override {
auto module = getOperation();
auto* context = &getContext();
FunLangTypeConverter typeConverter;
ConversionTarget target(*context);
// Mark funlang.match as illegal (must be lowered)
target.addIllegalOp<MatchOp>();
// Mark SCF operations as legal
target.addLegalDialect<scf::SCFDialect>();
target.addLegalDialect<arith::ArithDialect>();
target.addLegalDialect<LLVM::LLVMDialect>();
target.addLegalDialect<func::FuncDialect>();
// Keep other FunLang ops legal (lowered in FunLangToLLVM pass)
target.addLegalOp<NilOp, ConsOp, ClosureOp, ApplyOp>();
RewritePatternSet patterns(context);
patterns.add<MatchOpLowering>(typeConverter, context);
if (failed(applyPartialConversion(module, target, std::move(patterns)))) {
signalPassFailure();
}
}
};
std::unique_ptr<Pass> createFunLangToSCFPass() {
return std::make_unique<FunLangToSCFPass>();
}
핵심 로직 분석:
1. Type conversion (lines ~95-100)
Type structType = typeConverter->convertType(input.getType());
Value structVal = rewriter.create<UnrealizedConversionCastOp>(
loc, structType, input).getResult(0);
!funlang.list<i32> → !llvm.struct<(i32, ptr)> 변환.
UnrealizedConversionCastOp는 type conversion의 placeholder다. 나중에 다른 pass가 이를 실제 operations로 대체하거나 제거한다.
2. Tag extraction (lines ~103-108)
Value tag = rewriter.create<LLVM::ExtractValueOp>(loc, structVal, 0);
Value tagIndex = rewriter.create<arith::IndexCastOp>(
loc, rewriter.getIndexType(), tag);
Struct의 첫 번째 field (tag)를 추출하고 index type으로 cast.
3. scf.index_switch creation (lines ~113-115)
auto indexSwitchOp = rewriter.create<scf::IndexSwitchOp>(
loc, resultType, tagIndex, matchOp.getCases().size());
N개의 cases를 가진 index_switch 생성.
4. Region cloning (lines ~118-160)
각 case region을 iterate하며:
- Nil case (caseIndex == 0): Block arguments 없음, 그냥 clone
- Cons case (caseIndex == 1): Block arguments 있음, extract + map
5. IRMapping for block arguments (lines ~130-148)
mapper.map(originalBlock->getArgument(0), headFunLang);
mapper.map(originalBlock->getArgument(1), tail);
Original block의 arguments를 extracted values로 mapping.
6. Operation cloning (lines ~152-159)
for (Operation& op : originalBlock->getOperations()) {
if (auto yieldOp = dyn_cast<YieldOp>(&op)) {
Value yieldValue = mapper.lookupOrDefault(yieldOp.getValue());
rewriter.create<scf::YieldOp>(loc, yieldValue);
} else {
rewriter.clone(op, mapper);
}
}
funlang.yield→scf.yield변환- 다른 operations는 mapper와 함께 clone
7. Default region (lines ~163-176)
Unreachable case를 위한 default region 생성.
Complete Pipeline: Pass Registration
전체 lowering pipeline:
FunLang Dialect (with match, nil, cons, closure, apply)
↓
[FunLangToSCFPass]
↓
FunLang (without match) + SCF
↓
[FunLangToLLVMPass] (lowers nil, cons, closure, apply)
↓
LLVM + SCF
↓
[SCFToControlFlowPass]
↓
LLVM + CF
↓
[ConvertControlFlowToLLVMPass]
↓
LLVM Dialect only
↓
[LLVMToObjectPass]
↓
Object file
Pass manager setup (F# code):
// File: FunLang.Compiler/Pipeline.fs
let lowerToLLVM (module_: Module) =
let pm = PassManager.Create(module_.Context)
// 1. FunLang → SCF (lower match operation)
pm.AddPass(FunLangPasses.CreateFunLangToSCFPass())
// 2. FunLang → LLVM (lower nil, cons, closure, apply)
pm.AddPass(FunLangPasses.CreateFunLangToLLVMPass())
// 3. SCF → CF
pm.AddPass(SCFPasses.CreateSCFToControlFlowPass())
// 4. CF → LLVM
pm.AddPass(ConversionPasses.CreateConvertControlFlowToLLVMPass())
// 5. Func → LLVM
pm.AddPass(ConversionPasses.CreateConvertFuncToLLVMPass())
// 6. Arith → LLVM
pm.AddPass(ConversionPasses.CreateConvertArithToLLVMPass())
pm.Run(module_) |> ignore
Pass dependencies:
-
FunLangToSCFPassmust run beforeFunLangToLLVMPass- Reason: MatchOp의 regions에 다른 FunLang ops (nil, cons, etc.) 포함
- SCF로 변환 후 남은 FunLang ops를 LLVM으로 변환
-
SCFToControlFlowPassmust run after all FunLang lowering- Reason: SCF ops는 다른 dialects가 모두 LLVM으로 변환된 후 lower
-
ConvertFuncToLLVMPassmust run after SCF/CF conversion- Reason: Function signatures에 FunLang types가 남아있으면 안 됨
End-to-End Example: sum_list Function
F# source code:
// FunLang source
let rec sum_list lst =
match lst with
| [] -> 0
| head :: tail -> head + sum_list tail
let main () =
let my_list = [1; 2; 3]
sum_list my_list
Stage 1: FunLang Dialect (after F# compiler)
module {
func.func @sum_list(%lst: !funlang.list<i32>) -> i32 {
%result = funlang.match %lst : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
%sum = arith.addi %head, %tail_sum : i32
funlang.yield %sum : i32
}
return %result : i32
}
func.func @main() -> i32 {
// Build list [1, 2, 3]
%nil = funlang.nil : !funlang.list<i32>
%c3 = arith.constant 3 : i32
%l3 = funlang.cons %c3, %nil : (i32, !funlang.list<i32>) -> !funlang.list<i32>
%c2 = arith.constant 2 : i32
%l2 = funlang.cons %c2, %l3 : (i32, !funlang.list<i32>) -> !funlang.list<i32>
%c1 = arith.constant 1 : i32
%l1 = funlang.cons %c1, %l2 : (i32, !funlang.list<i32>) -> !funlang.list<i32>
// Call sum_list
%sum = func.call @sum_list(%l1) : (!funlang.list<i32>) -> i32
return %sum : i32
}
}
Stage 2: After FunLangToSCFPass
module {
func.func @sum_list(%lst: !funlang.list<i32>) -> i32 {
// Type conversion
%struct = builtin.unrealized_conversion_cast %lst
: !funlang.list<i32> to !llvm.struct<(i32, ptr)>
// Tag extraction
%tag = llvm.extractvalue %struct[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %tag : i32 to index
// Data pointer
%data_ptr = llvm.extractvalue %struct[1] : !llvm.struct<(i32, ptr)>
// Index switch
%result = scf.index_switch %tag_index : index -> i32
case 0 {
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 {
// Extract head
%head = llvm.load %data_ptr : !llvm.ptr -> i32
// Extract tail
%one = arith.constant 1 : i32
%tail_ptr = llvm.getelementptr %data_ptr[%one] : (!llvm.ptr, i32) -> !llvm.ptr
%tail_struct_ptr = llvm.load %tail_ptr : !llvm.ptr -> !llvm.ptr
%tail_struct = llvm.load %tail_struct_ptr : !llvm.ptr -> !llvm.struct<(i32, ptr)>
%tail = builtin.unrealized_conversion_cast %tail_struct
: !llvm.struct<(i32, ptr)> to !funlang.list<i32>
// Recursive call
%tail_sum = func.call @sum_list(%tail) : (!funlang.list<i32>) -> i32
// Sum
%sum = arith.addi %head, %tail_sum : i32
scf.yield %sum : i32
}
default {
%error = arith.constant -1 : i32
scf.yield %error : i32
}
return %result : i32
}
func.func @main() -> i32 {
// Still has funlang.nil and funlang.cons (not lowered yet)
%nil = funlang.nil : !funlang.list<i32>
%c3 = arith.constant 3 : i32
%l3 = funlang.cons %c3, %nil : (i32, !funlang.list<i32>) -> !funlang.list<i32>
// ...
%sum = func.call @sum_list(%l1) : (!funlang.list<i32>) -> i32
return %sum : i32
}
}
Stage 3: After FunLangToLLVMPass
module {
func.func @sum_list(%lst: !llvm.struct<(i32, ptr)>) -> i32 {
// ... same as Stage 2 but types converted ...
%tag = llvm.extractvalue %lst[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %tag : i32 to index
%data_ptr = llvm.extractvalue %lst[1] : !llvm.struct<(i32, ptr)>
%result = scf.index_switch %tag_index : index -> i32
case 0 { /* ... */ }
case 1 { /* ... */ }
default { /* ... */ }
return %result : i32
}
func.func @main() -> i32 {
// funlang.nil and funlang.cons lowered to LLVM
%c0 = arith.constant 0 : i32
%null = llvm.mlir.zero : !llvm.ptr
%undef_nil = llvm.mlir.undef : !llvm.struct<(i32, ptr)>
%s1_nil = llvm.insertvalue %c0, %undef_nil[0] : !llvm.struct<(i32, ptr)>
%nil = llvm.insertvalue %null, %s1_nil[1] : !llvm.struct<(i32, ptr)>
%c3 = arith.constant 3 : i32
%c1_tag = arith.constant 1 : i32
%size = arith.constant 16 : i64 // sizeof(cons cell)
%ptr = llvm.call @GC_malloc(%size) : (i64) -> !llvm.ptr
llvm.store %c3, %ptr : i32, !llvm.ptr
%tail_ptr = llvm.getelementptr %ptr[1] : (!llvm.ptr) -> !llvm.ptr
// ... store tail ...
%undef_cons = llvm.mlir.undef : !llvm.struct<(i32, ptr)>
%s1_cons = llvm.insertvalue %c1_tag, %undef_cons[0] : !llvm.struct<(i32, ptr)>
%l3 = llvm.insertvalue %ptr, %s1_cons[1] : !llvm.struct<(i32, ptr)>
// ...
%sum = func.call @sum_list(%l1) : (!llvm.struct<(i32, ptr)>) -> i32
return %sum : i32
}
}
Stage 4: After SCFToControlFlowPass
module {
func.func @sum_list(%lst: !llvm.struct<(i32, ptr)>) -> i32 {
%tag = llvm.extractvalue %lst[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %tag : i32 to index
%data_ptr = llvm.extractvalue %lst[1] : !llvm.struct<(i32, ptr)>
// scf.index_switch → cf.switch
cf.switch %tag_index : index, [
default: ^default,
0: ^case_0,
1: ^case_1
]
^case_0:
%zero = arith.constant 0 : i32
cf.br ^merge(%zero : i32)
^case_1:
%head = llvm.load %data_ptr : !llvm.ptr -> i32
// ... extract tail ...
%tail_sum = func.call @sum_list(%tail) : (!llvm.struct<(i32, ptr)>) -> i32
%sum = arith.addi %head, %tail_sum : i32
cf.br ^merge(%sum : i32)
^default:
%error = arith.constant -1 : i32
cf.br ^merge(%error : i32)
^merge(%result: i32):
return %result : i32
}
func.func @main() -> i32 {
// ... LLVM code for list construction ...
%sum = func.call @sum_list(%l1) : (!llvm.struct<(i32, ptr)>) -> i32
return %sum : i32
}
}
Stage 5: After ConvertControlFlowToLLVMPass + ConvertFuncToLLVMPass
llvm.func @sum_list(%arg0: !llvm.struct<(i32, ptr)>) -> i32 {
%0 = llvm.extractvalue %arg0[0] : !llvm.struct<(i32, ptr)>
%1 = llvm.sext %0 : i32 to i64 // index cast
%2 = llvm.extractvalue %arg0[1] : !llvm.struct<(i32, ptr)>
llvm.switch %1 : i64, ^default [
0: ^case_0,
1: ^case_1
]
^case_0:
%c0 = llvm.mlir.constant(0 : i32) : i32
llvm.br ^merge(%c0 : i32)
^case_1:
%head = llvm.load %2 : !llvm.ptr -> i32
// ... tail extraction ...
%tail_sum = llvm.call @sum_list(%tail) : (!llvm.struct<(i32, ptr)>) -> i32
%sum = llvm.add %head, %tail_sum : i32
llvm.br ^merge(%sum : i32)
^default:
%error = llvm.mlir.constant(-1 : i32) : i32
llvm.br ^merge(%error : i32)
^merge(%result: i32):
llvm.return %result : i32
}
llvm.func @main() -> i32 {
// ... LLVM code ...
%sum = llvm.call @sum_list(%l1) : (!llvm.struct<(i32, ptr)>) -> i32
llvm.return %sum : i32
}
Stage 6: Native code (after llc + linking)
$ ./funlang_program
6
Pipeline verification at each stage:
# After each pass, verify IR
$ mlir-opt --funlang-to-scf --verify-diagnostics input.mlir
$ mlir-opt --funlang-to-llvm --verify-diagnostics input.mlir
$ mlir-opt --convert-scf-to-cf --verify-diagnostics input.mlir
Common Errors and Debugging
Error 1: Block argument count mismatch
Symptom:
error: 'scf.yield' op result type mismatch
Cause:
Cons case region의 block arguments 개수가 틀림.
// Wrong: forgot to map tail argument
mapper.map(originalBlock->getArgument(0), headFunLang);
// Missing: mapper.map(originalBlock->getArgument(1), tail);
Fix:
모든 block arguments를 map해야 함.
mapper.map(originalBlock->getArgument(0), headFunLang);
mapper.map(originalBlock->getArgument(1), tail); // ✅
Error 2: Type mismatch after region cloning
Symptom:
error: 'func.call' op operand type mismatch: expected '!llvm.struct<...>', got '!funlang.list<...>'
Cause:
Region 내부의 operations가 아직 type conversion 안 됨.
Why:
FunLangToSCFPass는 partial conversion이다. Match operation만 lower하고 나머지 FunLang ops는 그대로 둔다.
Fix:
Region cloning 후 남은 FunLang operations는 다음 pass (FunLangToLLVMPass)에서 처리됨.
Temporary workaround: UnrealizedConversionCastOp 사용.
Value tail = extractTail(...); // Returns LLVM struct
// Cast back to FunLang type for func.call
Value tailFunLang = rewriter.create<UnrealizedConversionCastOp>(
loc, tailFunLangType, tail).getResult(0);
Error 3: Missing scf.yield in converted regions
Symptom:
error: block must terminate with scf.yield
Cause:
funlang.yield를 scf.yield로 변환하는 걸 까먹음.
// Wrong: just clone YieldOp as-is
for (Operation& op : originalBlock->getOperations()) {
rewriter.clone(op, mapper); // funlang.yield gets cloned!
}
Fix:
YieldOp를 특별히 처리해서 변환.
for (Operation& op : originalBlock->getOperations()) {
if (auto yieldOp = dyn_cast<YieldOp>(&op)) {
Value yieldValue = mapper.lookupOrDefault(yieldOp.getValue());
rewriter.create<scf::YieldOp>(loc, yieldValue); // ✅ Convert
} else {
rewriter.clone(op, mapper);
}
}
Error 4: Wrong tag values (0 vs 1 confusion)
Symptom:
런타임에 엉뚱한 case가 실행됨. 예: Nil list인데 Cons case 실행.
Cause:
Tag mapping이 틀림.
// Wrong: reversed mapping
// case 0 → Cons (wrong!)
// case 1 → Nil (wrong!)
Fix:
Chapter 18의 tag values와 일치시켜야 함:
// Correct mapping
// case 0 → Nil (tag = 0)
// case 1 → Cons (tag = 1)
for (auto [caseIndex, caseRegion] : llvm::enumerate(matchOp.getCases())) {
// caseIndex = 0 → Nil region (first in match)
// caseIndex = 1 → Cons region (second in match)
}
F# compiler는 pattern 순서를 보장해야 함:
// F# compiler must emit cases in this order:
// Case 0: Nil
// Case 1: Cons
match lst with
| [] -> ... // Must be first case
| head :: tail -> ... // Must be second case
Error 5: Incorrect data extraction from cons cell
Symptom:
런타임 segfault or garbage values.
Cause:
GEP indices 틀림.
// Wrong: GEP from struct pointer
Value tailPtr = builder.create<LLVM::GEPOp>(
loc, ptrType, structVal, ValueRange{one}); // ❌ structVal is value not pointer
Fix:
Data pointer는 이미 cons cell array를 가리킴.
// Correct: dataPtr already points to [head, tail] array
Value headPtr = dataPtr; // Points to head
Value head = builder.create<LLVM::LoadOp>(loc, headType, headPtr);
Value one = builder.create<arith::ConstantIntOp>(loc, 1, i32Type);
Value tailPtr = builder.create<LLVM::GEPOp>(
loc, ptrType, dataPtr, ValueRange{one}); // ✅ GEP from array pointer
Debugging strategies:
- Print IR after each pass:
$ mlir-opt --funlang-to-scf --print-ir-after-all input.mlir
- Use verifier:
$ mlir-opt --funlang-to-scf --verify-diagnostics input.mlir
- Dump operations in lowering code:
matchOp.dump(); // Before lowering
indexSwitchOp.dump(); // After lowering
- Check IRMapping:
for (auto [caseIndex, region] : enumerate(matchOp.getCases())) {
Block* block = ®ion.front();
llvm::errs() << "Case " << caseIndex << ":\n";
for (BlockArgument arg : block->getArguments()) {
llvm::errs() << " Arg: " << arg << " → "
<< mapper.lookupOrDefault(arg) << "\n";
}
}
리터럴 패턴 로우어링
지금까지 constructor patterns (Nil, Cons)를 위한 lowering을 설명했다. 이제 리터럴 패턴을 위한 lowering 전략을 다룬다.
Constructor vs Literal Dispatch
Constructor 패턴과 리터럴 패턴의 핵심 차이:
| 특성 | Constructor | Literal |
|---|---|---|
| 값의 범위 | 유한 (finite) | 무한 (infinite) |
| 테스트 | Tag extraction | Value comparison |
| MLIR dispatch | scf.index_switch | arith.cmpi + scf.if |
| Complexity | O(1) | O(n) sequential |
Constructor patterns use scf.index_switch:
Tag는 유한한 범위 (예: 0 = Nil, 1 = Cons)이므로 jump table이 가능하다.
// Constructor dispatch: O(1)
%tag = llvm.extractvalue %list[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %tag : i32 to index
%result = scf.index_switch %tag_index : index -> i32
case 0 { /* Nil case */ scf.yield ... }
case 1 { /* Cons case */ scf.yield ... }
default { scf.yield %unreachable : i32 }
Literal patterns use arith.cmpi + scf.if chain:
정수 리터럴은 무한하므로 순차적 비교가 필요하다.
// Literal dispatch: O(n)
%is_zero = arith.cmpi eq, %x, %c0 : i32
%result = scf.if %is_zero -> i32 {
scf.yield %zero_result : i32
} else {
%is_one = arith.cmpi eq, %x, %c1 : i32
%inner = scf.if %is_one -> i32 {
scf.yield %one_result : i32
} else {
scf.yield %default_result : i32
}
scf.yield %inner : i32
}
리터럴 패턴 Lowering 구현
리터럴 매칭을 위한 C++ lowering pattern:
// LiteralMatchLowering.cpp
class LiteralMatchOpLowering : public OpConversionPattern<LiteralMatchOp> {
public:
using OpConversionPattern<LiteralMatchOp>::OpConversionPattern;
LogicalResult matchAndRewrite(
LiteralMatchOp matchOp,
OpAdaptor adaptor,
ConversionPatternRewriter& rewriter) const override {
Location loc = matchOp.getLoc();
Value scrutinee = adaptor.getScrutinee();
Type resultType = matchOp.getResult().getType();
// Collect all cases: (literal_value, region)
auto cases = matchOp.getCases();
Region* defaultRegion = matchOp.getDefaultRegion();
// Build nested scf.if chain from bottom up
Value result = buildIfChain(rewriter, loc, scrutinee,
cases, defaultRegion, resultType);
rewriter.replaceOp(matchOp, result);
return success();
}
private:
Value buildIfChain(
ConversionPatternRewriter& rewriter,
Location loc,
Value scrutinee,
ArrayRef<std::pair<int64_t, Region*>> cases,
Region* defaultRegion,
Type resultType) const {
// Base case: no more cases, use default
if (cases.empty()) {
return cloneRegionAndGetResult(rewriter, loc, defaultRegion, resultType);
}
// Current case
auto [literalValue, caseRegion] = cases.front();
auto remainingCases = cases.drop_front();
// Create comparison: scrutinee == literal
Value literalConst = rewriter.create<arith::ConstantIntOp>(
loc, literalValue, scrutinee.getType());
Value isMatch = rewriter.create<arith::CmpIOp>(
loc, arith::CmpIPredicate::eq, scrutinee, literalConst);
// Create scf.if
auto ifOp = rewriter.create<scf::IfOp>(
loc, resultType, isMatch,
/*thenBuilder=*/[&](OpBuilder& b, Location loc) {
Value result = cloneRegionAndGetResult(b, loc, caseRegion, resultType);
b.create<scf::YieldOp>(loc, result);
},
/*elseBuilder=*/[&](OpBuilder& b, Location loc) {
Value result = buildIfChain(
rewriter, loc, scrutinee, remainingCases, defaultRegion, resultType);
b.create<scf::YieldOp>(loc, result);
});
return ifOp.getResult(0);
}
Value cloneRegionAndGetResult(
OpBuilder& builder,
Location loc,
Region* region,
Type resultType) const {
// Clone operations from region
IRMapping mapper;
for (Operation& op : region->front()) {
if (auto yieldOp = dyn_cast<YieldOp>(&op)) {
return mapper.lookupOrDefault(yieldOp.getValue());
} else {
builder.clone(op, mapper);
}
}
llvm_unreachable("Region must end with yield");
}
};
생성된 IR 예제:
// FunLang source
match x with
| 0 -> "zero"
| 1 -> "one"
| _ -> "other"
// After lowering: nested scf.if chain
%c0 = arith.constant 0 : i32
%c1 = arith.constant 1 : i32
%is_zero = arith.cmpi eq, %x, %c0 : i32
%result = scf.if %is_zero -> !llvm.ptr<i8> {
scf.yield %zero_str : !llvm.ptr<i8>
} else {
%is_one = arith.cmpi eq, %x, %c1 : i32
%inner = scf.if %is_one -> !llvm.ptr<i8> {
scf.yield %one_str : !llvm.ptr<i8>
} else {
// Default case: no comparison
scf.yield %other_str : !llvm.ptr<i8>
}
scf.yield %inner : !llvm.ptr<i8>
}
최적화 기회 (Optimization Opportunities)
1. Dense Range Detection
리터럴이 0, 1, 2, … 연속일 때 scf.index_switch로 변환 가능:
// Before: sequential comparisons
%is_0 = arith.cmpi eq, %x, %c0
scf.if %is_0 { ... } else {
%is_1 = arith.cmpi eq, %x, %c1
scf.if %is_1 { ... } else {
%is_2 = arith.cmpi eq, %x, %c2
// ...
}
}
// After: range check + index_switch
%in_range = arith.cmpi ult, %x, %c3 : i32
scf.if %in_range {
%idx = arith.index_cast %x : i32 to index
scf.index_switch %idx : index -> i32
case 0 { /* case 0 */ }
case 1 { /* case 1 */ }
case 2 { /* case 2 */ }
} else {
// default
}
Dense range detection algorithm:
bool isDenseRange(ArrayRef<int64_t> literals) {
if (literals.empty()) return false;
// Sort literals
SmallVector<int64_t> sorted(literals.begin(), literals.end());
llvm::sort(sorted);
// Check if consecutive
for (size_t i = 1; i < sorted.size(); ++i) {
if (sorted[i] != sorted[i-1] + 1)
return false;
}
// Starts from 0 or 1 (common case)
return sorted[0] == 0 || sorted[0] == 1;
}
2. Sparse Set Optimization
리터럴이 sparse할 때 (예: 0, 10, 100) binary search 가능:
// O(log n) with binary search
%mid = arith.constant 10 : i32
%less_than_mid = arith.cmpi slt, %x, %mid : i32
scf.if %less_than_mid {
// Check 0
%is_0 = arith.cmpi eq, %x, %c0
scf.if %is_0 { ... } else { /* default */ }
} else {
// Check 10, 100
%is_10 = arith.cmpi eq, %x, %c10
scf.if %is_10 { ... } else {
%is_100 = arith.cmpi eq, %x, %c100
scf.if %is_100 { ... } else { /* default */ }
}
}
이 최적화는 MLIR transformation pass로 구현 가능 (Phase 7).
3. LLVM Backend Optimization
SCF → CF → LLVM pipeline 후 LLVM backend가 추가 최적화:
; LLVM will recognize this pattern
%cmp0 = icmp eq i32 %x, 0
br i1 %cmp0, label %case0, label %check1
check1:
%cmp1 = icmp eq i32 %x, 1
br i1 %cmp1, label %case1, label %default
; And optimize to switch:
switch i32 %x, label %default [
i32 0, label %case0
i32 1, label %case1
]
LLVM switch lowering:
- Dense: jump table (O(1))
- Sparse: binary search tree (O(log n))
- Very sparse: linear search (O(n))
Mixed Patterns: Constructor + Literal
실제 코드는 constructor와 literal을 섞어 쓴다:
match (list, n) with
| (Nil, _) -> 0
| (Cons(x, _), 0) -> x
| (Cons(x, xs), n) -> x + process xs (n - 1)
Lowering 전략:
- First column (list): Constructor pattern →
scf.index_switch - Second column (n): Literal pattern →
arith.cmpi+scf.if
// Step 1: Constructor dispatch on list
%list_tag = llvm.extractvalue %list[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %list_tag : i32 to index
%result = scf.index_switch %tag_index : index -> i32
case 0 {
// Nil case: wildcard on n (no test)
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 {
// Cons case: extract data
%data = llvm.extractvalue %list[1] : !llvm.struct<(i32, ptr)>
%x = llvm.load %data : !llvm.ptr -> i32
// Step 2: Literal dispatch on n
%is_zero = arith.cmpi eq, %n, %c0 : i32
%inner = scf.if %is_zero -> i32 {
// Case: Cons(x, _), 0 → x
scf.yield %x : i32
} else {
// Case: Cons(x, xs), n → x + process xs (n-1)
%tail_ptr = llvm.getelementptr %data[1] : (!llvm.ptr) -> !llvm.ptr
%xs = llvm.load %tail_ptr : !llvm.ptr -> !llvm.struct<(i32, ptr)>
%n_minus_1 = arith.subi %n, %c1 : i32
%rest = func.call @process(%xs, %n_minus_1) : (...) -> i32
%sum = arith.addi %x, %rest : i32
scf.yield %sum : i32
}
scf.yield %inner : i32
}
핵심 원칙:
- Constructor column:
scf.index_switch로 O(1) dispatch - Literal column:
scf.ifchain으로 O(n) dispatch - Wildcard: test 없음 (fallthrough 또는 skip)
와일드카드 Default Case 처리
Wildcard (_) pattern의 lowering:
Wildcard는 어떤 테스트도 생성하지 않는다.
match x with
| 0 -> "zero"
| 1 -> "one"
| _ -> "other" // Wildcard: no test
%is_zero = arith.cmpi eq, %x, %c0 : i32
scf.if %is_zero {
scf.yield %zero_str : !llvm.ptr<i8>
} else {
%is_one = arith.cmpi eq, %x, %c1 : i32
scf.if %is_one {
scf.yield %one_str : !llvm.ptr<i8>
} else {
// _ case: NO comparison, just yield
scf.yield %other_str : !llvm.ptr<i8>
}
}
Wildcard optimization in subpatterns:
match list with
| Cons(_, tail) -> length tail + 1 // Don't extract head
| Nil -> 0
case 1 { // Cons
// Wildcard _: Skip head extraction
// %head = llvm.load %data -- NOT generated!
%tail_ptr = llvm.getelementptr %data[1] : (!llvm.ptr) -> !llvm.ptr
%tail = llvm.load %tail_ptr : !llvm.ptr -> !llvm.struct<(i32, ptr)>
// ...
}
Wildcard 최적화의 효과:
- 메모리 접근 감소: 불필요한 load 제거
- 레지스터 절약: unused 값을 저장 안 함
- DCE 촉진: Dead code elimination이 더 쉬워짐
Type Dispatch Pattern
타입 기반 dispatch (future extension):
일부 언어는 runtime type으로 dispatch한다:
// Hypothetical type dispatch
match value with
| :? int as n -> n + 1
| :? string as s -> String.length s
| _ -> 0
이는 다음으로 lowering 가능:
// Type tag dispatch (similar to ADT constructor)
%type_tag = llvm.extractvalue %boxed_value[0] : !llvm.struct<(i32, ptr)>
%tag_index = arith.index_cast %type_tag : i32 to index
scf.index_switch %tag_index : index -> i32
case 0 { /* int case */ }
case 1 { /* string case */ }
default { scf.yield %zero : i32 }
현재 FunLang은 ADT constructor만 지원하지만, 동일한 패턴이 적용된다.
Summary and Chapter 20 Preview
Chapter 19 Recap
이 장에서 배운 것:
✅ Part 1: Match Operation Definition
-
Region-based operations
- Regions vs basic blocks: encapsulation, verification 장점
funlang.match는 multiple regions (variadic, each with 1 block)
-
Match operation semantics
- Runtime execution: tag extraction → case selection → variable binding → yield
- Block arguments for pattern variables
-
TableGen definition
- Traits:
RecursiveSideEffect,SingleBlockImplicitTerminator<"YieldOp"> VariadicRegion<SizedRegion<1>>for flexible case count- Custom assembly format, verifier
- Traits:
-
YieldOp terminator
- Terminator trait, HasParent<“MatchOp”> constraint
- Dedicated operation (not reusing scf.yield)
-
C API and F# integration
- Builder callback pattern for region construction
- High-level wrapper:
CreateMatchOp(scrutinee, resultType, buildCases) - Block arguments added in F# callback, mapped in lowering pass
✅ Part 2: SCF Lowering and Pipeline
-
SCF dialect overview
- Structured control flow (regions, not goto)
scf.index_switchfor multi-way branching- Why SCF before LLVM: structure preservation, optimization, debugging
-
MatchOpLowering pattern
- Tag extraction and index casting
- Data extraction for pattern variables
- IRMapping for block argument remapping
- Region cloning with mapped values
funlang.yield→scf.yieldconversion
-
Complete pipeline
- FunLangToSCFPass → FunLangToLLVMPass → SCFToControlFlowPass → …
- Pass dependencies and ordering
- End-to-end example: sum_list
-
Common errors
- Block argument count mismatch
- Type mismatch in regions
- Missing scf.yield
- Wrong tag values
- Incorrect data extraction
Pattern Matching Pipeline: Complete
Phase 6 journey:
Chapter 17: Theory
↓
Decision tree algorithm
Pattern matrix, specialization/defaulting
Exhaustiveness checking
Chapter 18: Data Structures
↓
!funlang.list<T> type
funlang.nil, funlang.cons operations
TypeConverter, lowering patterns
Chapter 19: Match Compilation (현재)
↓
funlang.match operation
Region-based structure
MatchOpLowering to scf.index_switch
Complete pipeline
Chapter 20: Functional Programs (next)
↓
Realistic examples: map, filter, fold
Performance analysis
Debugging functional code
지금까지의 성과:
| Feature | Chapters | Operations | Status |
|---|---|---|---|
| Arithmetic | 5-6 | arith.* | ✅ Phase 2 |
| Let bindings | 7 | SSA values | ✅ Phase 2 |
| Control flow | 8 | scf.if | ✅ Phase 2 |
| Functions | 10 | func.func, func.call | ✅ Phase 3 |
| Recursion | 11 | func.call @self | ✅ Phase 3 |
| Closures | 12 | funlang.closure | ✅ Phase 5 |
| Higher-order | 13 | funlang.apply | ✅ Phase 5 |
| Custom dialect | 14-16 | Lowering passes | ✅ Phase 5 |
| Pattern matching | 17-19 | funlang.match | ✅ Phase 6 (현재) |
| Data structures | 17-19 | funlang.nil, funlang.cons | ✅ Phase 6 (현재) |
다음: Realistic functional programs
Chapter 20 Preview: Functional Programs
Chapter 20에서 할 것:
-
Classic list functions
length,map,filter,fold_left,fold_right- Pattern matching + recursion 결합
-
Composed functions
sum = fold_left (+) 0product = fold_left (*) 1- Higher-order functions로 추상화
-
Performance analysis
- Tail recursion vs non-tail recursion
- Closure allocation overhead
- GC pressure measurement
-
Debugging techniques
- IR dumping at each stage
- printf debugging in functional code
- Stack trace interpretation
-
Complete FunLang compiler
- All features integrated
- End-to-end compilation
- Real-world program examples
Chapter 20 목표:
지금까지 배운 모든 기능을 종합하여 실용적인 함수형 프로그램을 작성하고 컴파일한다.
Example program (Chapter 20):
// Functional list library
let rec map f lst =
match lst with
| [] -> []
| head :: tail -> f head :: map f tail
let rec filter pred lst =
match lst with
| [] -> []
| head :: tail ->
if pred head then
head :: filter pred tail
else
filter pred tail
let rec fold_left f acc lst =
match lst with
| [] -> acc
| head :: tail -> fold_left f (f acc head) tail
// Usage
let double x = x * 2
let is_even x = x % 2 = 0
let main () =
let numbers = [1; 2; 3; 4; 5; 6]
let doubled = map double numbers // [2; 4; 6; 8; 10; 12]
let evens = filter is_even doubled // [2; 4; 6; 8; 10; 12]
let sum = fold_left (+) 0 evens // 42
sum
Generated MLIR (high-level view):
module {
func.func @map(%f: !funlang.closure, %lst: !funlang.list<i32>)
-> !funlang.list<i32> {
%result = funlang.match %lst {
^nil: ...
^cons(%h, %t): ...
}
return %result
}
func.func @filter(%pred: !funlang.closure, %lst: !funlang.list<i32>)
-> !funlang.list<i32> { ... }
func.func @fold_left(%f: !funlang.closure, %acc: i32, %lst: !funlang.list<i32>)
-> i32 { ... }
func.func @main() -> i32 {
// Build list [1; 2; 3; 4; 5; 6]
%numbers = ...
// map double numbers
%double = funlang.closure @double, () : !funlang.closure
%doubled = func.call @map(%double, %numbers) : ...
// filter is_even doubled
%is_even = funlang.closure @is_even, () : !funlang.closure
%evens = func.call @filter(%is_even, %doubled) : ...
// fold_left (+) 0 evens
%plus = funlang.closure @plus, () : !funlang.closure
%zero = arith.constant 0 : i32
%sum = func.call @fold_left(%plus, %zero, %evens) : ...
return %sum : i32
}
}
Chapter 20 will show:
- Complete compilation to native code
- Performance benchmarks
- Comparison with imperative equivalents
- Debugging workflow for functional programs
튜플 패턴 매칭 (Tuple Pattern Matching)
Chapter 18에서 우리는 !funlang.tuple<T1, T2, ...> 타입과 funlang.make_tuple 연산을 구현했다. 이제 튜플에 대한 패턴 매칭을 구현하자.
튜플 패턴의 특성 (Tuple Pattern Characteristics)
튜플 패턴은 리스트 패턴과 근본적으로 다르다:
| 특성 | 리스트 패턴 | 튜플 패턴 |
|---|---|---|
| 태그 검사 | 필요 (Nil/Cons 구분) | 불필요 |
| 패턴 case 수 | 최소 2개 (Nil, Cons) | 항상 1개 |
| 매칭 실패 가능성 | 있음 | 없음 (항상 매칭) |
| Lowering 대상 | scf.index_switch | 직접 extractvalue |
| 제어 흐름 | 조건부 분기 | 선형 |
핵심 통찰:
튜플 패턴은 본질적으로 **구조 분해(destructuring)**다. 항상 매칭이 성공하므로 조건 분기가 필요 없다.
// 리스트: 두 가지 가능성, 조건 분기 필요
match list with
| [] -> expr1 // Nil case
| x :: xs -> expr2 // Cons case
// 튜플: 한 가지 가능성, 조건 분기 불필요
match pair with
| (x, y) -> x + y // 항상 이 case로
funlang.match 튜플 지원 (Tuple Support in funlang.match)
튜플 패턴 매칭의 MLIR 표현:
// 소스 코드: let (x, y) = pair in x + y
%pair = funlang.make_tuple(%a, %b) : !funlang.tuple<i32, i32>
%sum = funlang.match %pair : !funlang.tuple<i32, i32> -> i32 {
^case(%x: i32, %y: i32):
%result = arith.addi %x, %y : i32
funlang.yield %result : i32
}
리스트 패턴과 비교:
// 리스트 패턴: 두 case
%result = funlang.match %list : !funlang.list<i32> -> i32 {
^nil:
%zero = arith.constant 0 : i32
funlang.yield %zero : i32
^cons(%head: i32, %tail: !funlang.list<i32>):
// ...
funlang.yield %sum : i32
}
// 튜플 패턴: 한 case만
%result = funlang.match %tuple : !funlang.tuple<i32, i32> -> i32 {
^case(%x: i32, %y: i32):
// 항상 이 case 실행
funlang.yield %result : i32
}
튜플 패턴의 핵심:
- 단일 case: 분기 불필요
- block arguments = 구조 분해된 원소:
(%x, %y)→^case(%x: i32, %y: i32) - 원소 개수 = block argument 개수: 타입의 arity와 일치
튜플 로우어링 구현 (Tuple Lowering Implementation)
핵심 차이점:
리스트 패턴 lowering:
- 태그 추출 (extractvalue [0])
- scf.index_switch로 분기
- 각 case에서 데이터 추출
튜플 패턴 lowering:
- 각 원소 추출 (extractvalue [i])
- 원래 block의 operation들을 inline
- 분기 없음!
TupleMatchLowering 패턴:
// 튜플 패턴 매칭의 lowering은 특별 처리가 필요
// MatchOpLowering 내부에서 튜플 타입 감지 시
LogicalResult matchTuplePattern(MatchOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) {
Location loc = op.getLoc();
Value input = adaptor.getInput();
// 튜플은 단일 case만 가짐
assert(op.getCases().size() == 1 && "Tuple match must have exactly one case");
Region& caseRegion = op.getCases().front();
Block& caseBlock = caseRegion.front();
// 구조체에서 각 원소 추출
auto structType = input.getType().cast<LLVM::LLVMStructType>();
IRMapping mapper;
for (size_t i = 0; i < caseBlock.getNumArguments(); ++i) {
Value extracted = rewriter.create<LLVM::ExtractValueOp>(
loc, structType.getBody()[i], input, i);
mapper.map(caseBlock.getArgument(i), extracted);
}
// 원래 block의 operations를 현재 위치에 inline
for (Operation& op : caseBlock.getOperations()) {
if (auto yieldOp = dyn_cast<YieldOp>(&op)) {
// yield의 값으로 match 결과를 대체
Value yieldValue = mapper.lookupOrDefault(yieldOp.getValue());
rewriter.replaceOp(op, yieldValue);
} else {
rewriter.clone(op, mapper);
}
}
return success();
}
Lowering 결과 비교:
// Before lowering (FunLang)
%sum = funlang.match %pair : !funlang.tuple<i32, i32> -> i32 {
^case(%x: i32, %y: i32):
%result = arith.addi %x, %y : i32
funlang.yield %result : i32
}
// After lowering (LLVM dialect) - 분기 없음!
%x = llvm.extractvalue %pair[0] : !llvm.struct<(i32, i32)>
%y = llvm.extractvalue %pair[1] : !llvm.struct<(i32, i32)>
%sum = arith.addi %x, %y : i32
리스트 패턴 lowering과 비교:
// 리스트: scf.index_switch 필요
%tag = llvm.extractvalue %list[0] : !llvm.struct<(i32, ptr)>
%tagIndex = arith.index_cast %tag : i32 to index
%result = scf.index_switch %tagIndex : index -> i32
case 0 { // Nil
%zero = arith.constant 0 : i32
scf.yield %zero : i32
}
case 1 { // Cons
%ptr = llvm.extractvalue %list[1] : !llvm.struct<(i32, ptr)>
%head = llvm.load %ptr : !llvm.ptr -> i32
// ...
scf.yield %sum : i32
}
default {
llvm.unreachable
}
// 튜플: extractvalue만으로 충분
%x = llvm.extractvalue %pair[0] : !llvm.struct<(i32, i32)>
%y = llvm.extractvalue %pair[1] : !llvm.struct<(i32, i32)>
%result = arith.addi %x, %y : i32
중첩 패턴 (Nested Patterns)
튜플 안에 리스트가 있는 경우:
// 두 리스트를 튜플로 묶어서 동시에 패턴 매칭
let rec zip xs ys =
match (xs, ys) with
| ([], _) -> []
| (_, []) -> []
| (x :: xs', y :: ys') -> (x, y) :: zip xs' ys'
MLIR 표현:
// 1단계: 튜플 구조 분해
%tuple = funlang.make_tuple(%xs, %ys) : !funlang.tuple<!funlang.list<i32>, !funlang.list<i32>>
// 2단계: 튜플에서 두 리스트 추출
%xs_extracted = ... extractvalue [0] ...
%ys_extracted = ... extractvalue [1] ...
// 3단계: 첫 번째 리스트에 대해 패턴 매칭
%result = funlang.match %xs_extracted : !funlang.list<i32> -> ... {
^nil:
// 빈 리스트 반환
^cons(%x: i32, %xs_tail: !funlang.list<i32>):
// 4단계: 두 번째 리스트에 대해 중첩 패턴 매칭
%inner = funlang.match %ys_extracted : !funlang.list<i32> -> ... {
^nil:
// 빈 리스트 반환
^cons(%y: i32, %ys_tail: !funlang.list<i32>):
// (x, y) :: zip xs_tail ys_tail
%pair = funlang.make_tuple(%x, %y) : !funlang.tuple<i32, i32>
%rest = func.call @zip(%xs_tail, %ys_tail) : ...
%result = funlang.cons %pair, %rest : ...
funlang.yield %result
}
funlang.yield %inner
}
중첩 패턴 lowering 전략:
- 외부에서 내부로: 가장 바깥 패턴부터 lowering
- 튜플 먼저: 튜플 분해는 조건 없이 extractvalue
- 리스트는 분기: 각 리스트 패턴은 scf.index_switch 필요
- 깊이 우선: 내부 패턴이 완전히 lowering된 후 외부로
MatchOpLowering 확장: 튜플 지원
기존 MatchOpLowering에 튜플 분기 추가:
class MatchOpLowering : public OpConversionPattern<MatchOp> {
public:
LogicalResult matchAndRewrite(MatchOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {
Location loc = op.getLoc();
Value input = adaptor.getInput();
Type inputType = op.getInput().getType();
// 튜플인지 확인
if (auto tupleType = inputType.dyn_cast<funlang::TupleType>()) {
return matchTuplePattern(op, adaptor, rewriter, tupleType);
}
// 리스트인 경우 기존 로직 사용
if (auto listType = inputType.dyn_cast<funlang::ListType>()) {
return matchListPattern(op, adaptor, rewriter, listType);
}
return op.emitError("unsupported match input type");
}
private:
LogicalResult matchTuplePattern(MatchOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter,
funlang::TupleType tupleType) const {
Location loc = op.getLoc();
Value input = adaptor.getInput();
// 튜플은 단일 case만 허용
if (op.getCases().size() != 1) {
return op.emitError("tuple match must have exactly one case");
}
Region& caseRegion = op.getCases().front();
Block& caseBlock = caseRegion.front();
// block argument 개수 검증
if (caseBlock.getNumArguments() != tupleType.getNumElements()) {
return op.emitError() << "tuple arity mismatch: type has "
<< tupleType.getNumElements() << " elements but pattern has "
<< caseBlock.getNumArguments();
}
// 각 원소 추출 및 매핑
auto structType = getTypeConverter()->convertType(tupleType);
IRMapping mapper;
for (size_t i = 0; i < caseBlock.getNumArguments(); ++i) {
auto elemType = structType.cast<LLVM::LLVMStructType>().getBody()[i];
Value extracted = rewriter.create<LLVM::ExtractValueOp>(
loc, elemType, input, i);
mapper.map(caseBlock.getArgument(i), extracted);
}
// 현재 위치에 operations inline
Value resultValue;
for (Operation& caseOp : caseBlock.getOperations()) {
if (auto yieldOp = dyn_cast<YieldOp>(&caseOp)) {
resultValue = mapper.lookupOrDefault(yieldOp.getValue());
} else {
rewriter.clone(caseOp, mapper);
}
}
rewriter.replaceOp(op, resultValue);
return success();
}
LogicalResult matchListPattern(MatchOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter,
funlang::ListType listType) const {
// 기존 리스트 패턴 매칭 로직...
// (scf.index_switch 사용)
}
};
리스트 vs 튜플 패턴 매칭 종합 비교
| 구분 | 리스트 패턴 | 튜플 패턴 |
|---|---|---|
| 타입 | !funlang.list<T> | !funlang.tuple<T1, T2, ...> |
| case 수 | 2개 이상 (Nil, Cons, …) | 정확히 1개 |
| 태그 검사 | 필요 (extractvalue [0]) | 불필요 |
| 분기 | scf.index_switch | 없음 |
| 데이터 추출 | ptr load 필요 | extractvalue만 |
| 패턴 변수 바인딩 | 조건부 (case 안에서) | 무조건 |
| default case | 있음 (unreachable) | 없음 |
| lowering 복잡도 | 높음 | 낮음 |
| 최종 코드 크기 | 크다 (분기 포함) | 작다 (선형) |
생성되는 코드 비교:
// 리스트 패턴 매칭 결과 (복잡)
%tag = llvm.extractvalue %list[0] : !llvm.struct<(i32, ptr)>
%idx = arith.index_cast %tag : i32 to index
%result = scf.index_switch %idx : index -> i32
case 0 { // ~10 lines
...
scf.yield %val0 : i32
}
case 1 { // ~15 lines
...
scf.yield %val1 : i32
}
default { // ~3 lines
llvm.unreachable
}
// 튜플 패턴 매칭 결과 (단순)
%x = llvm.extractvalue %pair[0] : !llvm.struct<(i32, i32)>
%y = llvm.extractvalue %pair[1] : !llvm.struct<(i32, i32)>
// ... operations inline ...
튜플 패턴의 최적화 기회
튜플 패턴 매칭은 이미 최적화되어 있다:
- 분기 제거: 조건문 없이 바로 연산
- Inlining 자동: 별도 함수 호출 없음
- 레지스터 친화적: 작은 튜플은 레지스터에 유지
추가 최적화 가능:
// Before: 사용하지 않는 원소도 추출
%pair = funlang.make_tuple(%a, %b) : !funlang.tuple<i32, i32>
%result = funlang.match %pair {
^case(%x: i32, %y: i32): // %y 사용 안 함
funlang.yield %x : i32
}
// After: Dead code elimination으로 %y 추출 제거
%x = llvm.extractvalue %pair[0] : !llvm.struct<(i32, i32)>
// %y 추출 생략됨
%result = %x
와일드카드 패턴 (Wildcard Pattern)
사용하지 않는 원소는 와일드카드로:
// 첫 번째 원소만 사용
let fst pair = match pair with (x, _) -> x
// 두 번째 원소만 사용
let snd pair = match pair with (_, y) -> y
MLIR에서 와일드카드:
// 와일드카드는 block argument가 없음
%fst = funlang.match %pair : !funlang.tuple<i32, i32> -> i32 {
^case(%x: i32): // %y 자리에 block argument 없음 (또는 unused)
funlang.yield %x : i32
}
Lowering 시 최적화:
// 와일드카드 패턴 처리
for (size_t i = 0; i < tupleType.getNumElements(); ++i) {
BlockArgument arg = caseBlock.getArgument(i);
if (!arg.use_empty()) { // 사용되는 경우에만 추출
Value extracted = rewriter.create<LLVM::ExtractValueOp>(...);
mapper.map(arg, extracted);
}
// 와일드카드 (미사용)면 extractvalue 생략
}
Summary: 튜플 패턴 매칭
구현 완료:
- 튜플 패턴의 특성 이해 (단일 case, 분기 없음)
- funlang.match에서 튜플 타입 지원
- MatchOpLowering에서 튜플/리스트 분기 처리
- extractvalue 체인으로 원소 추출
- 분기 없이 inline lowering
- 중첩 패턴 (튜플 + 리스트) 처리 전략
- 와일드카드 패턴과 dead code elimination
튜플 패턴 vs 리스트 패턴:
| 측면 | 리스트 | 튜플 |
|---|---|---|
| 패턴 case | 다중 | 단일 |
| 제어 흐름 | scf.index_switch | 선형 |
| 태그 검사 | 필요 | 불필요 |
| Lowering | 복잡 | 단순 |
| 생성 코드 | 분기 포함 | extractvalue만 |
다음:
- Chapter 20에서 튜플을 활용한 실제 프로그램 (zip, fst/snd)
- 중첩 튜플과 포인트 예제
Conclusion
Chapter 19 완료!
우리는 funlang.match operation을 정의하고 SCF dialect로 lowering하여 패턴 매칭 컴파일 파이프라인을 완성했다.
핵심 개념:
- Region-based operations: Encapsulation과 verification을 위한 구조
- Multi-stage lowering: FunLang → SCF → CF → LLVM (progressive refinement)
- IRMapping: Block arguments를 실제 values로 remapping
- Builder callback pattern: F#에서 regions를 구축하는 방법
- 튜플 패턴 매칭: 분기 없는 extractvalue 기반 lowering
Phase 6 진행 상황:
- ✅ Chapter 17: Pattern matching theory (Decision tree algorithm)
- ✅ Chapter 18: List operations (funlang.nil, funlang.cons, funlang.tuple, funlang.make_tuple)
- ✅ Chapter 19: Match compilation (funlang.match, 리스트/튜플 패턴 lowering)
- ⏭️ Chapter 20: Functional programs (map, filter, fold, zip - realistic examples)
다음 장에서 만나요!