Language
0. Conventions
The keywords MUST, MUST NOT, and SHOULD carry their RFC 2119 normative meaning.
1. Overview
Hachi is a statically typed, imperative, C++-transpiling language with:
• colon-driven syntax (: for assignment + call)
• :: for definitions / constant functions
• left–right argument model (Li/Ri)
• optional type inference
• auto-free semantics for Hachi objects
• escape hatches to raw C++ via innerCPP and outerCPP.
The compiler outputs a self-contained C++ file and optionally builds it via the system compiler.
2. Lexical Structure
2.1 Encoding
UTF-8 only.
2.2 Whitespace
Hachi does not use indentation for structure. Newlines end statements unless inside parentheses or braces.
2.3 Comments
Single-line
# like Python
Multiline
//
...
\\
// opens a multiline comment; \\ on a line by itself closes it.
2.4 Identifiers
• [A-Za-z_][A-Za-z0-9_]*
• Case-sensitive.
• Keywords cannot be redefined.
2.5 Literals
• Integer: [0-9][0-9_]*
• Float: [0-9]*\.[0-9]+ with optional exponent
• Bool: tru, fls
• String: "..." with \" and \\ escapes
• No char literal type (chars are 1-length Strings).
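The lexical rules in §2.4 and §2.5 can be sketched as regular expressions. This is a minimal Python model, not the compiler's lexer: the function name classify is hypothetical, and the exponent form ([eE] with an optional sign) is an assumption, since the text only says "optional exponent".

```python
import re

# Patterns transcribed from sections 2.4 and 2.5.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
INTEGER = re.compile(r"[0-9][0-9_]*")
# Assumption: the exponent looks like e10, E-3, e+7.
FLOAT = re.compile(r"[0-9]*\.[0-9]+(?:[eE][+-]?[0-9]+)?")
# Only the \" and \\ escapes are accepted, as listed in section 2.5.
STRING = re.compile(r'"(?:\\["\\]|[^"\\])*"')
BOOLS = {"tru", "fls"}


def classify(lexeme: str) -> str:
    """Return a token-kind name for one complete lexeme."""
    if lexeme in BOOLS:  # checked first: tru/fls also match IDENT
        return "Bool"
    for kind, pattern in (
        ("Float", FLOAT),
        ("Integer", INTEGER),
        ("String", STRING),
        ("Identifier", IDENT),
    ):
        if pattern.fullmatch(lexeme):
            return kind
    return "Unknown"
```

Under these patterns, .5 is a valid Float (the integer part may be empty), while 1. is not, because at least one digit must follow the dot.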
3. Type System
3.1 Primitive Types
| Type | Notes |
|---|---|
| Int | 64-bit signed integer (C++ long long) |
| Flt | double precision |
| Bool | tru/fls |
| Byte | unsigned 8-bit |
| String | Hachi-managed heap string with auto-free |
| Void | no return |
| AnyT | runtime-tagged (variant) box |
3.2 Structs and Tuples
Named struct
Point :: {
x: Int
y: Int
}
Positional struct (tuple-ish)
Pair :: {Int, String}
Access rules (implementation-grounded)
• Named fields use dot: p.x.
• Positional fields become autogenerated identifiers: .a, .b, .c (in order).
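The naming of positional fields can be modelled as follows. This is a Python sketch under stated assumptions: the helper positional_field_names is hypothetical, and the text does not say what happens beyond 26 fields, so the sketch refuses that case rather than guessing.

```python
import string


def positional_field_names(count: int) -> list[str]:
    """Names for the fields of a positional struct, in declaration order."""
    # Behaviour past .z is not specified in the text; reject instead of guessing.
    if not 0 <= count <= len(string.ascii_lowercase):
        raise ValueError("field count outside the range this sketch covers")
    return list(string.ascii_lowercase[:count])


# Pair :: {Int, String}  ->  field .a is the Int, field .b is the String
pair_fields = dict(zip(positional_field_names(2), ["Int", "String"]))
```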
3.3 Arrays
IntArray: <size>
StringArray: <size>
Operations:
• .len – returns element count
• .get: idx – bounds-checked
• .set: idx, val – bounds-checked
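The three array operations can be modelled like this. It is a Python sketch of the bounds-checking behaviour only; zero-initialisation and the use of an exception to signal an out-of-range index are assumptions, as the text does not say how a failed check is reported.

```python
class IntArray:
    """Fixed-size integer array with checked element access."""

    def __init__(self, size: int) -> None:
        self._items = [0] * size  # assumption: elements start at zero

    @property
    def len(self) -> int:
        """Element count, mirroring .len."""
        return len(self._items)

    def _check(self, idx: int) -> None:
        # Negative indices are rejected too; Python would otherwise wrap around.
        if not 0 <= idx < len(self._items):
            raise IndexError(f"index {idx} out of range for length {len(self._items)}")

    def get(self, idx: int) -> int:
        """Checked read, mirroring .get: idx."""
        self._check(idx)
        return self._items[idx]

    def set(self, idx: int, val: int) -> None:
        """Checked write, mirroring .set: idx, val."""
        self._check(idx)
        self._items[idx] = val
```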
4. Names and Bindings
4.1 Variables (:)
The colon is assignment if the LHS is not a function name, otherwise it is a call.
x: 3
x: x + 1
A variable is created implicitly on its first assignment.
4.2 Constants and Constant Functions (::)
:: defines a constant or a function.
Constant expressions MUST be compile-time evaluable.
Function example:
add :: {a:Int, b:Int} -> {Int} : (
Ri.a + Ri.b
)
Constant example:
MAX :: 10
4.3 Import (>@)
>@ "fmt",
>@ "so"
Resolution order:
- HACHI_LIB (env var)
- Current file directory
- Compiler’s built-in lib folder
Import names omit the .hachi suffix.
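The lookup can be sketched as a search over the three locations in order. This Python model makes several assumptions that the text does not confirm: HACHI_LIB names a single directory, the first existing file wins, and the loader appends .hachi to the import name. The function name resolve_import and its parameters are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_import(name: str, current_file: Path, builtin_lib: Path) -> Optional[Path]:
    """Return the first existing <dir>/<name>.hachi along the search order."""
    search_dirs = []
    env_dir = os.environ.get("HACHI_LIB")
    if env_dir:  # 1. HACHI_LIB, when set
        search_dirs.append(Path(env_dir))
    search_dirs.append(current_file.parent)  # 2. directory of the importing file
    search_dirs.append(builtin_lib)  # 3. compiler's built-in lib folder
    for directory in search_dirs:
        candidate = directory / f"{name}.hachi"
        if candidate.is_file():
            return candidate
    return None  # not found in any location
```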
5. Expressions and Function Call Semantics
5.1 Colon as the unified call/assign operator
The parser uses contextual rules:
name: expr
If name resolves to:
• variable → assignment
• function → call
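The contextual rule amounts to a lookup against the set of known function names. A minimal Python sketch, in which classify_colon and the sample set of functions are hypothetical:

```python
def classify_colon(name: str, functions: set[str]) -> str:
    """Decide how `name: expr` is read, given the known function names."""
    # A call when the left-hand side names a function; otherwise an
    # assignment, which also creates the variable if it does not exist yet.
    return "call" if name in functions else "assignment"


functions = {"print", "println"}
```

With this table, print: 3 is read as a call and x: 3 as an assignment.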
5.2 Arguments
Arguments may be separated by any of the following delimiters:
• comma
• space
• semicolon
• newline
All are normalized by the parser.
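Delimiter normalization can be illustrated for simple arguments. This Python sketch handles only atoms (literals and identifiers) and keeps string literals intact; compound expressions such as x + 1 need a real expression parser and are out of scope. The name split_args is hypothetical.

```python
import re

# One argument atom: either a string literal (kept whole, even if it
# contains delimiters) or a run of characters containing no delimiter.
_ATOM = re.compile(r'"(?:\\["\\]|[^"\\])*"|[^\s,;]+')


def split_args(text: str) -> list[str]:
    """Split an argument list on commas, semicolons, spaces, and newlines."""
    return _ATOM.findall(text)
```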
5.3 Left / Right Inputs (Li, Ri)
These mechanics follow ExprCall.cpp.
Left-side call:
leftExpr.func: rightExpr
Inside the function:
• Li holds the left struct
• Ri holds the right struct
Both named and unnamed (positional) fields are supported.
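The left/right model can be pictured as a function that receives two struct values. This is a Python sketch rather than a description of ExprCall.cpp: make_struct, call, and scale are hypothetical names, and positional fields are assumed to take the names a, b, c as described in §3.2.

```python
from types import SimpleNamespace


def make_struct(*positional, **named):
    """Build a struct value; positional fields are named a, b, c, ..."""
    fields = {chr(ord("a") + i): value for i, value in enumerate(positional)}
    fields.update(named)
    return SimpleNamespace(**fields)


def call(func, left, right):
    """Model of `leftExpr.func: rightExpr`: bind Li and Ri, then run the body."""
    return func(Li=left, Ri=right)


def scale(Li, Ri):
    # Hypothetical function: Li has named fields x and y,
    # Ri has a single positional field (accessed as .a).
    return make_struct(x=Li.x * Ri.a, y=Li.y * Ri.a)


point = make_struct(x=1, y=2)
scaled = call(scale, point, make_struct(3))  # models point.scale: 3
```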
5.4 Dot operator
Used for:
• member selection
• chaining into a function call (method syntax is syntactic sugar)
Examples:
obj.field
obj.method: 3
5.5 Return value
The last expression inside (...) of a function is the return value.
6. Functions
6.1 Declaration
name :: {params} -> {rets} : ( body )
or with left/right:
dot :: {Int}.{Int} -> {Int} : (
Li * Ri
)
6.2 Anonymous functions
Lambdas are supported:
({}:(print:"string"))
The body is parsed as a standalone expression block.
6.3 Overloading
Overloading is supported: functions may share a name as long as their left/right signature shapes differ.
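Dispatch on signature shape can be sketched as a table keyed by name plus left and right shapes. In this Python model a shape is assumed to be the tuple of parameter type names; the text does not define "shape" more precisely, and register and dispatch are hypothetical names.

```python
def register(table, name, left, right, impl):
    """Add one overload; left and right are sequences of type names."""
    key = (name, tuple(left), tuple(right))
    if key in table:
        raise ValueError(f"duplicate overload: {key}")
    table[key] = impl


def dispatch(table, name, left, right):
    """Pick the overload whose left/right shapes match exactly."""
    key = (name, tuple(left), tuple(right))
    if key not in table:
        raise LookupError(f"no overload matches: {key}")
    return table[key]


overloads = {}
# dot :: {Int}.{Int} -> {Int}
register(overloads, "dot", ["Int"], ["Int"], lambda li, ri: li * ri)
# Hypothetical second overload with two fields on each side.
register(
    overloads, "dot", ["Int", "Int"], ["Int", "Int"],
    lambda li, ri: li[0] * ri[0] + li[1] * ri[1],
)
```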
7. Control Flow
7.1 Conditionals (?)
cond ? (then) | (elif) | (else)
Single-expression bodies MAY omit parentheses.
7.2 Loops (@)
While-style
cond @ ( body )
C-style for
init | cond | incr @ ( body )
All expressions are evaluated with normal expression semantics.
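The C-style form can be read as a while loop with an initializer and a step. The Python sketch below assumes the usual C ordering (init once, cond before each iteration, incr after the body), which the text implies by calling the loop "C-style" but does not state outright; run_for is a hypothetical helper.

```python
def run_for(init, cond, incr, body):
    """Model of `init | cond | incr @ ( body )` as a while loop."""
    init()
    while cond():
        body()
        incr()  # assumption: the step runs after the body, as in C


# Sums 0..4, modelling:  i: 0 | i < 5 | i: i + 1 @ ( total: total + i )
state = {"i": 0, "total": 0}
run_for(
    init=lambda: state.update(i=0),
    cond=lambda: state["i"] < 5,
    incr=lambda: state.update(i=state["i"] + 1),
    body=lambda: state.update(total=state["total"] + state["i"]),
)
```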
8. Operators (Real Precedence)
Listed from highest to lowest precedence:
• . (member access)
• ! (unary)
• * / %
• + -
• = != < > <= >= (comparisons)
• && (logical)
• || (logical)
• ? and | (conditional separators)
• : and :: (statement-level)
Notes:
• = is equality, NOT assignment
• assignment is colon
• ^, bitwise, shifts, etc., are reserved tokens but unimplemented
• No ternary operator exists besides Hachi’s own cond ? (then) | (else) form (§7.1)
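The binary and unary levels of this ordering can be exercised with a small precedence-climbing parser. This Python sketch builds a nested-tuple tree rather than evaluating, so it makes no claim about Hachi's arithmetic; member access, calls, and the ? | form are omitted, and left associativity is taken from the repetition form used in the grammar (§13). All names are hypothetical.

```python
import re

TOKEN = re.compile(
    r"\s*([0-9][0-9_]*|[A-Za-z_][A-Za-z0-9_]*|<=|>=|!=|&&|\|\||[-+*/%=<>!()])"
)

# Binary operators; a higher number binds tighter.
PRECEDENCE = {
    "||": 1, "&&": 2,
    "=": 3, "!=": 3, "<": 3, ">": 3, "<=": 3, ">=": 3,
    "+": 4, "-": 4,
    "*": 5, "/": 5, "%": 5,
}


def tokenize(src):
    src, pos, tokens = src.strip(), 0, []
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(src):
    """Parse an expression into nested (op, left, right) / ("!", operand) tuples."""
    tokens, pos = tokenize(src), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def unary():
        tok = peek()
        if tok == "!":
            advance()
            return ("!", unary())
        if tok == "(":
            advance()
            node = binary(1)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok is None or tok == ")" or tok in PRECEDENCE:
            raise SyntaxError("expected an operand")
        return advance()

    def binary(min_prec):
        left = unary()
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            # Parsing the right side one level tighter gives left associativity.
            left = (op, left, binary(PRECEDENCE[op] + 1))
        return left

    tree = binary(1)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For example, parse("1 + 2 * 3") groups the multiplication first, and parse("a = b && !c") treats = as a comparison nested under &&.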
9. Built-ins (Actual runtime registry)
Reflecting HachiStbLib.cpp:
9.1 I/O
• print: v
• println: v
• input
• inputInt
• exit
• pass
9.2 Conversions
Available as both left- and right-forms:
Int: "123"
99.String
Flt: "2.5"
9.3 Strings
• len
• sub
• at
• toascii
9.4 Arrays
See §3.3.
9.5 Shell (hcmd)
hcmd: "ls -al"
Returns a String.
9.6 C++ Interop
Actual keywords:
outerCPP:"<...>"
innerCPP:"<...>"
outerCPP code is placed before main; innerCPP code is injected inside main.
10. Modules
Modules are plain .hachi files whose functions/consts become available after import.
Examples from the repository:
• fmt
• so
• json
• fs
• sys
• net (if present)
11. CLI & REPL
11.1 REPL Commands
.clear, .clr
.read, .r
.exit, .x
11.2 CLI flags
• -cpp <file>
• -b / -build <out>
• -buildml <out>
• -go
• -cf "<flags>" flags forwarded to the C++ compiler
• -cc "<flags>" additional builder flags
• -c "<code>" run inline snippet
• -v, -h, -d
12. Runtime Semantics
• Evaluation order is strictly left-to-right in expressions.
• All Hachi objects are auto-freed; C++ objects injected via innerCPP are not.
• Type mismatches are reported as compile-time or runtime errors, depending on the phase in which they are detected.
• AnyT introduces dynamic dispatch with a runtime type tag.
• Arrays enforce bounds checking.
• Strings always store size + bytes and are heap-owned.
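A runtime-tagged box of the kind AnyT describes can be sketched as a value paired with its type tag. This Python model is illustrative only: the class layout, the tag strings, and the box and unbox helpers are assumptions, not the runtime's actual representation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AnyT:
    tag: str  # one of the primitive type names from section 3.1
    value: Union[int, float, bool, str]


def box(value) -> AnyT:
    """Wrap a value together with a tag recording its type."""
    # bool is tested before int because Python's bool is a subclass of int.
    for py_type, tag in ((bool, "Bool"), (int, "Int"), (float, "Flt"), (str, "String")):
        if isinstance(value, py_type):
            return AnyT(tag, value)
    raise TypeError(f"cannot box {type(value).__name__}")


def unbox(boxed: AnyT, expected_tag: str):
    """Return the payload, failing at run time if the tag does not match."""
    if boxed.tag != expected_tag:
        raise TypeError(f"AnyT holds {boxed.tag}, not {expected_tag}")
    return boxed.value
```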
13. Grammar
program := { statement } ;
statement := constDecl
| assignment
| funcDecl
| expr
;
assignment := IDENT ':' expr ;
constDecl := IDENT '::' expr
| funcDecl ;
funcDecl := IDENT ('::' | ':') funcSig ':' '(' body ')' ;
funcSig := leftSig ('.' rightSig)? ('->' returnSig)? ;
leftSig := '{' params? '}' ;
rightSig := '{' params? '}' ;
returnSig := '{' params? '}' ;
params := param ( delim param )* ;
param := IDENT ':' TYPE
| TYPE ;
delim := ',' | ';' | NEWLINE | WHITESPACE ;
expr := condExpr ;
condExpr := orExpr ('?' block ('|' block)*)? ;
orExpr := andExpr ( '||' andExpr )* ;
andExpr := compareExpr ( '&&' compareExpr )* ;
compareExpr := addExpr ( ( '=' | '!=' | '<' | '>' | '<=' | '>=' ) addExpr )* ;
addExpr := multExpr ( ('+' | '-') multExpr )* ;
multExpr := unaryExpr ( ('*' | '/' | '%') unaryExpr )* ;
unaryExpr := '!' unaryExpr
| primary ;
primary := literal
| IDENT
| primary '.' IDENT
| IDENT ':' arglist
| '(' expr ')'
;
arglist := expr ( delim expr )* ;
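The funcSig and params productions can be checked against the examples in §4.2 and §6.1 with a small parser. This Python sketch covers only flat signatures (no nested braces inside a parameter list) and treats TYPE as a plain identifier; parse_signature and parse_params are hypothetical names.

```python
import re

SIG = re.compile(
    r"\s*\{(?P<left>[^{}]*)\}"              # leftSig
    r"(?:\s*\.\s*\{(?P<right>[^{}]*)\})?"   # optional '.' rightSig
    r"(?:\s*->\s*\{(?P<ret>[^{}]*)\})?\s*"  # optional '->' returnSig
)


def parse_params(text):
    """params := param (delim param)*, with param := IDENT ':' TYPE | TYPE."""
    if text is None:  # this part of the signature was absent
        return None
    # Glue "name : Type" together first so whitespace can act as a delimiter.
    text = re.sub(r"\s*:\s*", ":", text.strip())
    params = []
    for item in re.split(r"[,;\s]+", text):
        if not item:
            continue
        name, sep, type_name = item.partition(":")
        params.append((name, type_name) if sep else (None, name))
    return params


def parse_signature(text):
    """Split a funcSig into its left, right, and return parameter lists."""
    match = SIG.fullmatch(text)
    if match is None:
        raise SyntaxError(f"not a function signature: {text!r}")
    return {part: parse_params(match.group(part)) for part in ("left", "right", "ret")}
```

An unnamed parameter is reported with None as its name, so {Int}.{Int} -> {Int} yields one unnamed Int in each of the three lists.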
14. Known Divergences From Docs
• Flt is the real float type (not Dub).
• Array .get / .set are bounds-checked.
• .a, .b, .c auto-fields exist for positional structs.
• Member access has the highest operator precedence.
• Some experimental operators (^, &, bitwise) are tokenized but not implemented.
• Default string indexing is byte-based, not Unicode code-point aware.
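The practical effect of byte-based indexing shows up with any non-ASCII text. The Python snippet below illustrates the general UTF-8 behaviour; it does not call any Hachi API, and the sample string is arbitrary.

```python
text = "héllo"
encoded = text.encode("utf-8")  # source encoding is UTF-8 (section 2.1)

code_points = len(text)     # counts characters
byte_count = len(encoded)   # counts bytes; "é" occupies two of them

# Byte index 1 is only the first half of "é", not a whole character.
first_half = encoded[1:2]
whole_char = encoded[1:3].decode("utf-8")
```

A byte-based length therefore exceeds the character count whenever the string contains multi-byte characters, and a single byte index can land inside a character.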
15. Conformance Criteria
A conformance suite MUST validate:
• colon disambiguation (assign vs call)
• left/right application rules
• struct access rules (.a, .b, .c)
• overloading dispatch correctness
• conditionals with multiple | branches
• for-loop header handling
• array bounds safety
• correct C++ generation for calls, struct creation, lambdas
• correctness of auto-free semantics on all string/array paths