Refulang Developer Diary 1 – Introduction and Invitation for Collaboration

Introduction

This post introduces a pet project I have been working on, on and off, since 2011. It is a strongly typed, lazily evaluated programming language that aims to be a hybrid between imperative and functional languages. It is written in C and is still in a very early state, since it never managed to outgrow the status of a pet project of a single developer.

Until now. Through this and subsequent posts I would like to introduce Refulang to the world and openly invite anyone interested to collaborate with me on the project. Everything is on GitHub, in a state where developers from around the world can contribute by reading the code and opening pull requests and issues.

You get the chance to learn a lot and experiment by contributing to the development of a programming language, and to have a say in many of its design choices.

Before you say it: yes, I am aware of Rust, actively use it and love it. Back when I started this project, Rust was in its fledgling stages and I did not know of its existence.

Some History

Refulang started as a silly project of mine called Refu library, where I kept a lot of common functionality that I used in all of my C projects. In time I got the idea to start developing a language, and Refulang started forming. What used to be Refu library is now rfbase, a library of common functionality that the language uses as a submodule.

Unfortunately I never had the time to work on the language full time, so I was always working either late at night or on weekends, and progress has been rather slow. Additionally, as is often the case with such projects, the code is probably rather ugly in some places. Regardless, it is now in a state where what is there is very well tested, works and is well organized.

Some details about the language

The language is by no means perfectly defined at the moment. It compiles only on Linux, but there are issues tracking the porting effort for both macOS and Windows. There is no language specification apart from something I wrote long ago in Org mode, and I believe any document at this point in time should be written in Markdown to encourage collaboration. So for now the code is the specification of the language. Still, I can give a good description of the language’s current design and goals. Some of the following is not yet implemented.

From the very beginning Refulang has aimed at being a hybrid between imperative and functional languages, while keeping first and foremost in mind the goal of being usable and understandable.

It is a curly-brace language with a strong type system based on algebraic data types. It supports generics in the form of type parameters. Naturally it also implements pattern matching in order to deconstruct the algebraic data types. It compiles to LLVM bitcode.

type product_identifier {
    numeric_id:i64 | text_id:string
}

fn process_id(id:product_identifier)
    numeric_id:i64 => print("ID is a number: " numeric_id)
    tid:string => print("ID is ASCII: " tid)

fn main(args) -> u32
{
    id1 = product_identifier(5642)
    id2 = product_identifier("FF0AAAXYN")
    id3:product_identifier = if args[0] > 10 { "FFXEWQ01" } else { 64321 }

    process_id(id1)
    process_id(id2)
    process_id(id3)

    return 0
}

Above is a simple example with an algebraic data type, product_identifier, being instantiated in different ways and then deconstructed by an implicit match operator as part of the process_id() function.
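For readers coming from C, the following is a rough sketch of the same idea written by hand as a tagged union, with a switch on the tag standing in for the match. It is only an illustration: the names pid_tag, pid_from_number, pid_from_text and describe_id are made up for this post, and this is not necessarily how the Refulang compiler lowers algebraic data types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical C rendering of the product_identifier sum type. */
typedef enum { PID_NUMERIC, PID_TEXT } pid_tag;

typedef struct {
    pid_tag tag;              /* records which variant is active */
    union {
        int64_t numeric_id;   /* valid when tag == PID_NUMERIC */
        const char *text_id;  /* valid when tag == PID_TEXT */
    } as;
} product_identifier;

product_identifier pid_from_number(int64_t n)
{
    product_identifier id = { .tag = PID_NUMERIC, .as.numeric_id = n };
    return id;
}

product_identifier pid_from_text(const char *s)
{
    assert(s != NULL);
    product_identifier id = { .tag = PID_TEXT, .as.text_id = s };
    return id;
}

/* Counterpart of process_id(): the switch plays the role of the match.
   Writes the message into buf and returns snprintf's result. */
int describe_id(product_identifier id, char *buf, size_t len)
{
    assert(buf != NULL);
    switch (id.tag) {
    case PID_NUMERIC:
        return snprintf(buf, len, "ID is a number: %lld",
                        (long long)id.as.numeric_id);
    case PID_TEXT:
        return snprintf(buf, len, "ID is ASCII: %s", id.as.text_id);
    }
    return -1; /* unreachable for a well-formed tag */
}
```

In C nothing stops code from reading the wrong union member or forgetting a case, which is exactly the class of mistake that a type system built on algebraic data types and pattern matching is meant to rule out.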

fn use_array(arr:u64[6]) {
    // arr would only be evaluated when entering this function
    for i in arr {
        print(i)
    }
}

fn foo() {
    x = 0
    // since all values are known at compile time
    // the type of arr will be deduced to u64[6]
    arr = for a in 0:2:10 { x * 2 }

    // other code follows
    // ....
    // ....
    use_array(arr)
}

Above is another example where we can see lazy evaluation of an array using a for expression. The language aims to be lazily evaluated wherever possible in order to also allow for infinite data structures defined from the algebraic data types. I say aims, because this is a part of the language that is not yet implemented.
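Since this part is not implemented yet, the following is only a general sketch of how lazy evaluation is commonly modelled in C: a "thunk" stores a suspended computation and caches its result the first time it is forced. The names lazy_u64, lazy_make, lazy_force, double_env and double_it are invented for illustration and do not come from the Refulang code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A thunk: a suspended computation plus a cache for its result. */
typedef struct {
    uint64_t (*compute)(void *env); /* how to produce the value */
    void *env;                      /* data the computation needs */
    bool forced;                    /* has the value been computed yet? */
    uint64_t value;                 /* cached result once forced */
} lazy_u64;

lazy_u64 lazy_make(uint64_t (*compute)(void *), void *env)
{
    lazy_u64 t = { .compute = compute, .env = env,
                   .forced = false, .value = 0 };
    return t;
}

/* Evaluate on first use, then reuse the cached value. */
uint64_t lazy_force(lazy_u64 *t)
{
    assert(t != NULL && t->compute != NULL);
    if (!t->forced) {
        t->value = t->compute(t->env);
        t->forced = true;
    }
    return t->value;
}

/* Example computation: doubles its input and counts how often it runs. */
typedef struct {
    uint64_t input;
    int calls;
} double_env;

uint64_t double_it(void *env)
{
    double_env *e = env;
    e->calls++;
    return e->input * 2;
}
```

Creating a thunk with lazy_make(double_it, &env) does no work; the multiplication runs only on the first lazy_force call, and later calls return the cached value. An array such as arr in the example could be modelled as a sequence of such thunks, one per element.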

Furthermore, Refu encourages programming to the interface by using typeclasses, a way to guarantee the behaviour of objects of a specific type. Typeclasses act much like interfaces in Java or traits in Rust. They are inspired by Haskell.

Refu programs are organized in modules that encompass specific functionality. Everything in a module is private by default unless explicitly exported. Each module can import objects and functions from other modules. Modules can also have signatures separated from their implementation. That is, a module can have a single signature defining the type and interface of the module, but multiple implementations. As an example, consider an IO module that implements I/O functionality for Linux, Windows, ARM or even JavaScript!
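The idea of one signature with several implementations can be approximated in C with a struct of function pointers. The sketch below is an analogy only, not Refulang syntax or compiler code. The io_module type and the two toy implementations (plain_write and crlf_write, which stand in for platform-specific variants) are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The "signature": what every IO implementation must provide. */
typedef struct {
    const char *name;
    /* Copies msg into out (capacity cap); returns characters written. */
    size_t (*write)(const char *msg, char *out, size_t cap);
} io_module;

/* Implementation A: passes the text through unchanged. */
size_t plain_write(const char *msg, char *out, size_t cap)
{
    if (cap == 0) return 0;
    size_t n = strlen(msg);
    if (n >= cap) n = cap - 1;   /* truncate to fit the buffer */
    memcpy(out, msg, n);
    out[n] = '\0';
    return n;
}

/* Implementation B: same text, followed by a CRLF line ending. */
size_t crlf_write(const char *msg, char *out, size_t cap)
{
    size_t n = plain_write(msg, out, cap);
    if (n + 2 < cap) {               /* room for "\r\n" and terminator */
        memcpy(out + n, "\r\n", 3);  /* the 3rd byte is the terminator */
        n += 2;
    }
    return n;
}

const io_module io_plain = { "plain", plain_write };
const io_module io_crlf  = { "crlf", crlf_write };

/* Client code depends only on the signature, not on an implementation. */
size_t greet(const io_module *io, char *out, size_t cap)
{
    assert(io != NULL && io->write != NULL);
    return io->write("hello", out, cap);
}
```

Here greet() works with either &io_plain or &io_crlf without changes, which mirrors how code written against a module signature would not care whether it is linked against the Linux or the Windows implementation.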

The memory model of the language (even though not perfectly defined yet) aims to give freedom to the developer when required, but in most cases it will try to act invisibly. The memory model should be designed in such a way that the lifetime of most objects can be determined statically at compile time and proper optimization can occur. Rust performs such optimizations very well but requires the developer to explicitly define lifetimes and ownership via syntactic constructs. This decreases the usability of the language and makes for a much steeper learning curve. Refulang aims for code that is as optimized as possible without sacrificing ease of use. It’s all about trying to find a golden mean between usability and speed.

Current state of the code

The code is on GitHub in two different repositories. The main repository contains the entirety of the compiler code and uses the rfbase C library as a submodule for functionality that could easily be abstracted for other projects too.

The codebase is organized into five distinct sections that correspond to the stages of the compilation pipeline.

  • Lexer: The lexer of the language which reads in the source and splits it into a number of lexical tokens.
  • Parser: A recursive descent parser that continuously reads in the tokens fed to it by the lexer and builds the Abstract Syntax Tree (AST).
  • Analyzer: The analyzer stage is one of the most important ones. This is where all the typechecking and correctness analysis happens.
  • Intermediate Representation: This is the stage where the RIR (Refu Intermediate Representation) is created. The typechecked code is converted into an intermediate format where both further analysis and conversion to final backend code is much easier.
  • Backend Code Generation: The final stage of compilation, where the RIR is converted into backend executable code. The backend code generation is modular so that many different backends could be plugged in, but for now the only available backend is LLVM.
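To give a feel for the first stage of this pipeline, here is a deliberately tiny lexer sketch in C. It is not taken from the Refulang code base: the token kinds, next_token and count_tokens are assumptions made for illustration, and a real lexer also has to handle strings, comments, multi-character operators and source locations.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_EOF } token_kind;

typedef struct {
    token_kind kind;
    const char *start; /* points into the source buffer */
    size_t length;     /* number of characters in the token */
} token;

/* Reads one token starting at *cursor and advances the cursor past it. */
token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;   /* skip whitespace */

    token t = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        /* end of input: t stays TOK_EOF with length 0 */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else {
        t.kind = TOK_PUNCT;   /* any other single character */
        p++;
    }
    t.length = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Counts the tokens in src, excluding the end marker. */
size_t count_tokens(const char *src)
{
    size_t n = 0;
    while (next_token(&src).kind != TOK_EOF) n++;
    return n;
}
```

Fed the line id1 = product_identifier(5642) from the earlier example, this sketch yields six tokens: two identifiers, one number and three punctuation characters. A recursive descent parser would then consume such tokens one at a time to build the AST.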

How you can get involved

Refulang is still at an initial design and implementation level so there are many ways you can contribute.

  • You can contribute to the language design itself by participating in the discussions on how to design certain language features, both in the GitHub issues and on Gitter.
  • You can read the code, get a better understanding of how the language works and then help write more of the much-needed documentation.
  • You can pick up any of the low-hanging-fruit issues, develop a solution and open a pull request. If you are feeling adventurous you can also look at any of the bigger issues.
  • If you have any feedback or comments you can open an issue on GitHub or come to Gitter to discuss.

What makes Refulang exciting?

A language based on an algebraic data type system with lazy evaluation. An intuitive module system that can extend into a nice packaging system. A memory model that tries to optimize as much as possible without introducing too many concepts to the user, instead pushing all the work to the compiler and striving to be user friendly.

But first and foremost what makes Refulang exciting for someone at this point in time is its malleability. As a developer reading about, using and contributing to the development of Refulang you get the chance to shape a new programming language and guide its design and development.

What can you get out of it?

Getting involved with the development of Refulang at this point will not require a lot of your time (depending on your level of interest/commitment) and will give you the chance to:

  • Contribute to a new programming language from an early stage, participate in the language design process and have a hand in its creation.
  • Work on an open source project and be able to show what you are doing to everyone around the world.
  • Learn a lot about compilers and language design.
  • Participate in a cool project using the C language.

Conclusion

I hope you enjoyed this small introduction to Refu. Working on it has been an extremely rewarding journey for me so far, but I now need help. Please join me on GitHub or Gitter, bring new life to this project and let us together make an exciting new programming language.


About the Author

Lefteris Karapetsas is a passionate developer/tinkerer currently located in Berlin.

After graduating from the University of Tokyo, Lefteris has been developing backend software for various companies including Oracle and Acmepacket. He is an all-around tinkerer who loves to take things apart and put them back together, learning how they work in the process.

His interests include language/compiler design, Artificial Intelligence, Robotics, Systems programming, Distributed Systems and Blockchains. He feels at home with C code and GDB and tries to forward all that energy into the development of Refulang.

He has gained a lot of blockchain expertise by being part of Ethereum as a C++ core developer since its beginnings, having worked on Solidity, the ethash algorithm, the core client and the CI system. He had a hand in the creation of the DAO and in the cleanup after the hack. He is developing Sikorka, a system enabling people to use the Ethereum blockchain out in the real world. At the same time he is working with Brainbot AG as the project manager for Raiden. Raiden is bringing payment channels to Ethereum allowing vast scaling of the protocol by leveraging off-chain transactions.


Twitter: @lefterisjp Github: Lefterisjp contact: lefteris@refu.co
