Language

0. Conventions

MUST, MUST NOT, and SHOULD carry their RFC 2119 normative meanings.


1. Overview

Hachi is a statically typed, imperative, C++-transpiling language with:

• colon-driven syntax (: for assignment + call)
• :: for definitions / constant functions
• left–right argument model (Li/Ri)
• optional type inference
• auto-free semantics for Hachi objects
• escape hatches to raw C++ via innerCPP and outerCPP.

The compiler outputs a self-contained C++ file and optionally builds it via the system compiler.


2. Lexical Structure

2.1 Encoding

UTF-8 only.

2.2 Whitespace

Hachi does not use indentation for structure. Newlines end statements unless inside parentheses or braces.

2.3 Comments

Single-line

# like Python

Multiline

// 
 ...
\\

// opens a multiline comment; \\ on its own line closes it.

2.4 Identifiers

[A-Za-z_][A-Za-z0-9_]*
• Case-sensitive.
• Keywords cannot be redefined.

2.5 Literals

• Integer: [0-9][0-9_]*
• Float: [0-9]*\.[0-9]+ with optional exponent
• Bool: tru, fls
• String: "..." with \" and \\ escapes
• No char literal type (chars are 1-length Strings).
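The lexical rules in §2.4 and §2.5 can be sketched as a small classifier. This is a Python model rather than the compiler's actual lexer; the exponent syntax (e or E with an optional sign) and the function name classify are assumptions, since the text only says "optional exponent".

```python
import re

# Patterns transcribed from sections 2.4 and 2.5.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
INTEGER = re.compile(r"[0-9][0-9_]*")
# Assumption: the exponent looks like e+10 / E-3; the spec does not spell it out.
FLOAT = re.compile(r"[0-9]*\.[0-9]+(?:[eE][+-]?[0-9]+)?")
# Only \" and \\ are accepted as escapes inside a string.
STRING = re.compile(r'"(?:\\["\\]|[^"\\])*"')


def classify(lexeme: str) -> str:
    """Return the lexical class of one complete lexeme."""
    if lexeme in ("tru", "fls"):
        return "Bool"
    if FLOAT.fullmatch(lexeme):
        return "Float"
    if INTEGER.fullmatch(lexeme):
        return "Integer"
    if STRING.fullmatch(lexeme):
        return "String"
    if IDENT.fullmatch(lexeme):
        return "Identifier"
    return "Unknown"
```

Bool is checked before Identifier because tru and fls would otherwise match the identifier pattern.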


3. Type System

3.1 Primitive Types

Type      Notes
Int       64-bit signed integer (C++ long long)
Flt       double precision
Bool      tru/fls
Byte      unsigned 8-bit
String    Hachi-managed heap string with auto-free
Void      no return
AnyT      runtime-tagged (variant) box

3.2 Structs and Tuples

Named struct

Point :: {
    x: Int
    y: Int
}

Positional struct (tuple-ish)

Pair :: {Int, String}

Access rules (implementation-grounded)

• Named fields use dot: p.x.
• Positional fields become autogenerated identifiers: .a, .b, .c (in order).
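As an illustration of the positional naming rule, the Python sketch below generates the field names for a positional struct such as Pair :: {Int, String}. The helper name auto_field_names is hypothetical, and the behaviour past .c (continuing through the alphabet, with an error beyond 26 fields) is an assumption, since the text only lists .a, .b, .c.

```python
import string


def auto_field_names(field_count: int) -> list[str]:
    """Names for positional struct fields, in declaration order."""
    # Assumption: names continue a..z; the spec only documents .a, .b, .c.
    if not 0 <= field_count <= len(string.ascii_lowercase):
        raise ValueError("field count outside the range this sketch models")
    return list(string.ascii_lowercase[:field_count])
```

Under this model, Pair :: {Int, String} exposes its Int as .a and its String as .b.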

3.3 Arrays

IntArray: <size>
StringArray: <size>

Operations:

• .len – returns element count
• .get: idx – bounds-checked
• .set: idx, val – bounds-checked
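The bounds-checking contract can be modelled as follows. This is a Python sketch, not the runtime's implementation: the class name HachiArray, zero-based indexing, the default element value, and the use of IndexError as the failure mode are all assumptions.

```python
class HachiArray:
    """Fixed-size array whose get/set reject out-of-range indices."""

    def __init__(self, size: int, default=0):
        # Assumption: elements start at a default value.
        self._items = [default] * size

    @property
    def len(self) -> int:
        return len(self._items)

    def _check(self, idx: int) -> None:
        # Assumption: indices are zero-based; negative indices are rejected.
        if not 0 <= idx < len(self._items):
            raise IndexError(f"index {idx} out of bounds for length {self.len}")

    def get(self, idx: int):
        self._check(idx)
        return self._items[idx]

    def set(self, idx: int, val) -> None:
        self._check(idx)
        self._items[idx] = val
```

Both accessors share one check, so a read and a write can never disagree about which indices are valid.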


4. Names and Bindings

4.1 Variables (:)

The colon is an assignment when the LHS is not a function name; otherwise it is a call.

x: 3
x: x + 1

The variable is created implicitly.

4.2 Constants and Constant Functions (::)

:: defines a constant or a function.
Constant expressions MUST be compile-time evaluable.

Function example:

add :: {a:Int, b:Int} -> {Int} : (
    Ri.a + Ri.b
)

Constant example:

MAX :: 10

4.3 Import (>@)

>@ "fmt",
>@ "so"

Resolution order:

  1. HACHI_LIB (env var)
  2. Current file directory
  3. Compiler’s built-in lib folder

Import names omit the .hachi suffix.
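The resolution order above can be sketched in Python as a first-match search. The function name resolve_import and its parameters are hypothetical, and the sketch assumes HACHI_LIB names a single directory rather than a list of paths.

```python
import os
from pathlib import Path


def resolve_import(name, current_dir, builtin_lib, env=None):
    """Return the first matching <name>.hachi file, or None if not found."""
    env = os.environ if env is None else env
    search_dirs = []
    # 1. HACHI_LIB, if set (assumption: a single directory).
    lib = env.get("HACHI_LIB")
    if lib:
        search_dirs.append(Path(lib))
    # 2. Directory of the importing file.
    search_dirs.append(Path(current_dir))
    # 3. Compiler's built-in lib folder.
    search_dirs.append(Path(builtin_lib))
    for directory in search_dirs:
        candidate = directory / f"{name}.hachi"
        if candidate.is_file():
            return candidate
    return None
```

Because the search stops at the first hit, a module placed in HACHI_LIB shadows a same-named module next to the importing file or in the built-in folder.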


5. Expressions and Function Call Semantics

5.1 Colon is unified call/assign operator

The parser uses contextual rules:

name: expr

If name resolves to:

• variable → assignment
• function → call
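A minimal Python model of this contextual rule is shown below. It assumes the parser has a set of known function names available when it reaches the statement; the names classify_colon, functions, and variables are hypothetical and do not come from the compiler source.

```python
def classify_colon(name: str, functions: set, variables: set) -> str:
    """Decide whether `name: expr` is a call or an assignment."""
    if name in functions:
        return "call"
    # Not a function name: treat as assignment. Per section 4.1 the
    # variable is created implicitly if it does not exist yet.
    variables.add(name)
    return "assignment"
```

For example, with functions = {"print"}, the statement print: x is a call, while x: 3 is an assignment that also registers x as a variable.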

5.2 Arguments

Any delimiter inside the argument list is accepted:

• comma
• space
• semicolon
• newline

All are normalized by the parser.
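The normalization step might look like the following Python sketch. It is deliberately naive: it assumes every argument is a single token with no embedded spaces, string literals, or nested parentheses, none of which the real parser can assume. The function name split_args is hypothetical.

```python
import re

# Any run of commas, semicolons, spaces, or newlines separates arguments.
DELIMITERS = re.compile(r"[,;\s]+")


def split_args(arg_text: str) -> list[str]:
    """Split a flat argument list on any accepted delimiter."""
    return [part for part in DELIMITERS.split(arg_text.strip()) if part]
```

Under this model the argument lists "1, 2", "1 2", "1;2", and "1" followed by a newline and "2" all normalize to the same two arguments.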

5.3 Left / Right Inputs (Li, Ri)

Mechanics are from ExprCall.cpp.

Left-side call:

leftExpr.func: rightExpr

Inside the function:

• Li holds the left struct
• Ri holds the right struct

Both named and unnamed (positional) fields are supported.

5.4 Dot operator

Used for:

• member selection
• chaining into function call (method syntax is syntactic sugar)

Examples:

obj.field
obj.method: 3

5.5 Return value

The last expression inside (...) of a function is the return value.


6. Functions

6.1 Declaration

name :: {params} -> {rets} : ( body )

or with left/right:

dot :: {Int}.{Int} -> {Int} : (
    Li * Ri
)

6.2 Anonymous functions

Lambdas are supported:

({}:(print:"string"))

The body is parsed as a standalone expression block.

6.3 Overloading

Overloading is supported: functions may share a name as long as their left/right signature shapes differ.
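Dispatch on signature shape can be modelled with a table keyed by (name, left shape, right shape). The Python sketch below is an illustration under assumptions: it resolves overloads at call time from runtime values, whereas the compiler may well resolve them statically, and OverloadTable, TYPE_NAMES, and the sample dot overloads are hypothetical.

```python
# Hypothetical mapping from Python runtime types to Hachi type names.
TYPE_NAMES = {int: "Int", float: "Flt", bool: "Bool", str: "String"}


def shape(values):
    """Signature shape (tuple of type names) of a tuple of values."""
    return tuple(TYPE_NAMES[type(v)] for v in values)


class OverloadTable:
    def __init__(self):
        self._impls = {}

    def define(self, name, left, right, impl):
        key = (name, tuple(left), tuple(right))
        if key in self._impls:
            raise ValueError(f"duplicate overload for {name}")
        self._impls[key] = impl

    def call(self, name, li=(), ri=()):
        key = (name, shape(li), shape(ri))
        if key not in self._impls:
            raise LookupError(f"no overload of {name} matches {key[1:]}")
        # The implementation receives the left and right inputs (Li, Ri).
        return self._impls[key](li, ri)


table = OverloadTable()
table.define("dot", ["Int"], ["Int"], lambda li, ri: li[0] * ri[0])
table.define("dot", ["Flt"], ["Flt"], lambda li, ri: li[0] * ri[0])
table.define("dot", [], ["String"], lambda li, ri: ri[0].upper())
```

Two definitions with the same name and identical left/right shapes are rejected, which mirrors the requirement that overloads differ in shape.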


7. Control Flow

7.1 Conditionals (?)

cond ? (then) | (elif) | (else)

Single-expression bodies MAY omit parentheses.

7.2 Loops (@)

While-style

cond @ ( body )

C-style for

init | cond | incr @ ( body )

All expressions are evaluated with normal expression semantics.


8. Operators (Real Precedence)

Listed from highest (1) to lowest (9) precedence:

  1. .

  2. unary !

  3. * / %

  4. + -

  5. comparisons = != < > <= >=

  6. logical &&

  7. logical ||

  8. conditional separators ? and |

  9. : / :: (statement-level)

Notes:

• = is equality, NOT assignment
• assignment is colon
• ^, bitwise, shifts, etc., are reserved tokens but unimplemented
• No ternary operator besides Hachi’s own ? … | … conditional (§7.1)
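To make the table concrete, here is a small precedence-climbing parser in Python covering levels 2 through 7 (unary !, arithmetic, comparisons, && and ||) over integer literals. It is a sketch of the technique, not the compiler's parser: member access, ?, and : are left out, all binary operators are assumed left-associative, and the names tokenize, parse, and parse_unary are hypothetical.

```python
import re

# Binding power derived from the precedence list: larger binds tighter.
PRECEDENCE = {
    "||": 1, "&&": 2,
    "=": 3, "!=": 3, "<": 3, ">": 3, "<=": 3, ">=": 3,
    "+": 4, "-": 4,
    "*": 5, "/": 5, "%": 5,
}
# Two-character operators are listed before single characters.
TOKEN = re.compile(r"\s*(\d+|\|\||&&|!=|<=|>=|[-+*/%=<>!()])")


def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if not match:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_prec=1):
    """Precedence climbing; returns nested tuples (op, lhs, rhs)."""
    lhs = parse_unary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Assumption: left-associative, so the right side must bind tighter.
        rhs = parse(tokens, PRECEDENCE[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs


def parse_unary(tokens):
    tok = tokens.pop(0)
    if tok == "!":
        return ("!", parse_unary(tokens))
    if tok == "(":
        inner = parse(tokens)
        tokens.pop(0)  # consume the closing ")"
        return inner
    return int(tok)
```

Note that = appears here as a comparison operator, on the same level as != and <, which is why 1 + 2 = 3 groups as (1 + 2) = 3.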


9. Built-ins (Actual runtime registry)

Reflecting HachiStbLib.cpp:

9.1 I/O

print: v
println: v
input
inputInt
exit
pass

9.2 Conversions

Available as both left- and right-forms:

Int: "123"
99.String
Flt: "2.5"

9.3 Strings

len
sub
at
toascii

9.4 Arrays

See §3.3.

9.5 Shell (hcmd)

hcmd: "ls -al"

Returns a String.

9.6 C++ Interop

Actual keywords:

outerCPP:"<...>"
innerCPP:"<...>"

outerCPP code is placed before main; innerCPP code is injected inside main.


10. Modules

Modules are plain .hachi files whose functions/consts become available after import.

Examples from the repository:

fmt
so
json
fs
sys
net (if present)


11. CLI & REPL

11.1 REPL Commands

.clear, .clr
.read, .r
.exit, .x

11.2 CLI flags

-cpp <file>
-b / -build <out>
-buildml <out>
-go
-cf "<flags>" forward to C++ compiler
-cc "<flags>" additional builder flags
-c "<code>" run inline snippet
-v, -h, -d


12. Runtime Semantics

• Evaluation order is strictly left-to-right in expressions.
• All Hachi objects are auto-freed; C++ objects injected via innerCPP are not.
• Type mismatches are runtime or compile errors depending on phase.
• AnyT introduces dynamic dispatch with runtime type tag.
• Arrays enforce bounds checking.
• Strings always store size + bytes and are heap-owned.


13. Grammar

program        := { statement } ;

statement      := constDecl
                | assignment
                | funcDecl
                | expr
                ;

assignment     := IDENT ':' expr ;

constDecl      := IDENT '::' expr
                | funcDecl ;

funcDecl       := IDENT ('::' | ':') funcSig ':' '(' body ')' ;

funcSig        := leftSig ('.' rightSig)? ('->' returnSig)? ;

leftSig        := '{' params? '}' ;
rightSig       := '{' params? '}' ;
returnSig      := '{' params? '}' ;

params         := param ( delim param )* ;
param          := IDENT ':' TYPE
                | TYPE ;

delim          := ',' | ';' | NEWLINE | WHITESPACE ;

expr           := condExpr ;

condExpr       := orExpr ('?' block ('|' block)*)? ;

orExpr         := andExpr ( '||' andExpr )* ;
andExpr        := compareExpr ( '&&' compareExpr )* ;
compareExpr    := addExpr ( ( '=' | '!=' | '<' | '>' | '<=' | '>=' ) addExpr )* ;
addExpr        := multExpr ( ('+' | '-') multExpr )* ;
multExpr       := unaryExpr ( ('*' | '/' | '%') unaryExpr )* ;

unaryExpr      := '!' unaryExpr
                | primary ;

primary        := literal
                | IDENT
                | primary '.' IDENT
                | IDENT ':' arglist
                | '(' expr ')'
                ;

arglist        := expr ( delim expr )* ;

14. Known Divergences From Docs

• Flt is the real float type (not Dub).
• Array .get / .set are bounds-checked.
• .a, .b, .c auto-fields exist for positional structs.
• Member access has the highest operator precedence.
• Some experimental operators (^, &, bitwise) are tokenized but not implemented.
• Default string indexing is byte-based, not Unicode code-point aware.
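The last point is easy to trip over, so here is a short Python illustration of byte-based versus code-point indexing over UTF-8 (the encoding required by §2.1). The helper names byte_len and byte_at are hypothetical and only model the described behaviour; they are not Hachi built-ins.

```python
def byte_len(text: str) -> int:
    """Length in UTF-8 bytes, which is what byte-based indexing counts."""
    return len(text.encode("utf-8"))


def byte_at(text: str, index: int) -> int:
    """Byte value at a byte offset; may land inside a multi-byte character."""
    return text.encode("utf-8")[index]


word = "h\u00e9llo"  # "héllo": 5 code points, 6 UTF-8 bytes
```

Here byte offset 1 holds 0xC3, the first half of the two-byte encoding of é, so a byte-indexed read at that position does not yield a complete character.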


15. Conformance Criteria

A conformance suite MUST validate:

• colon disambiguation (assign vs call)
• left/right application rules
• struct access rules (.a, .b, .c)
• overloading dispatch correctness
• conditionals with multiple | branches
• for-loop header handling
• array bounds safety
• correct C++ generation for calls, struct creation, lambdas
• correctness of auto-free semantics on all string/array paths