Timeouts on calls are, as the OP mentions, a thing in Erlang. Inter-process and inter-computer calls in QNX can optionally time out, and this includes all system calls that can block. Real-time programs use such features. Probably don't want it on more than that. It's like having exceptions raised in things you thought worked.
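To make the "timeout on a call" idea concrete, here is a minimal Python sketch. The helper names `call_with_timeout` and `slow_reply` are made up for illustration, and this is only an approximation of what QNX or Erlang provide: the caller stops waiting, but the worker thread is not cancelled.

```python
import concurrent.futures
import time

def call_with_timeout(fn, timeout_s, *args):
    """Run fn(*args) in a worker thread.

    Raises concurrent.futures.TimeoutError if no result arrives in time.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Stop waiting, but note the worker itself keeps running to completion.
        pool.shutdown(wait=False)

def slow_reply(delay_s):
    """Hypothetical stand-in for a blocking call to another process."""
    time.sleep(delay_s)
    return "reply"
```

The "exceptions raised in things you thought worked" effect shows up right away: every call site of `call_with_timeout` now needs a plan for the timeout case.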
- Capabilities
They've been tried at the hardware level, and IBM used them in the System/38, but they never caught on. They're not really compatible with C's flat memory model, which is partly why they fell out of fashion.
Capabilities mean having multiple types of memory. Might come back if partially-shared multiprocessors make a comeback.
- Production-Level Releases
That's kind of vague. Semantic versioning is a related concept. It's more of a tooling thing than a language thing.
- Semi-Dynamic Language
I once proposed this for Python. The idea was that, at some point, the program made a call that told the system "Done initializing". After that point, you couldn't load more code, and some other things that inhibit optimization would be prohibited. At that point, the JIT compiler runs, once. No need for the horrors inside PyPy which deal with cleanup when someone patches one module from another.
Guido didn't like it.
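A toy sketch of the "Done initializing" idea, in Python. This is not the original proposal's design, just an illustration under my own assumptions: `Runtime`, `define`, and `done_initializing` are hypothetical names, and the "JIT" step is only a comment.

```python
from types import MappingProxyType

class SealedError(RuntimeError):
    """Raised when code tries to change definitions after sealing."""

class Runtime:
    """Toy function table that can be sealed exactly once."""

    def __init__(self):
        self._functions = {}
        self._sealed = False

    def define(self, name, fn):
        if self._sealed:
            raise SealedError(f"cannot (re)define {name!r} after done_initializing()")
        self._functions[name] = fn

    def done_initializing(self):
        # From here on the set of definitions is fixed, so a compiler could
        # specialize every call site once. Here we only freeze the table.
        self._sealed = True
        self._functions = MappingProxyType(dict(self._functions))

    def call(self, name, *args):
        return self._functions[name](*args)
```

Patching is still possible before the seal, which is where test setup and plugin loading would have to happen.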
- Value Database
The OP has a good criticism of why this is a bad idea. It's an old idea, mostly from LISP land, where early systems saved the whole LISP environment state.
Source control? What's that?
- A Truly Relational Language
Well, in Python, almost everything is a key/value store. The NoSQL people were going in that direction. Then people remembered that you want atomic transactions to keep the database from turning to junk, and mostly backed off from NoSQL where the data matters long-term.
- A Language To Encourage Modular Monoliths
Hm. Needs further development. Yes, we still have trouble putting parts together.
There's been real progress. Nobody has to keep rewriting Vol. I of Knuth algorithms in each new project any more. But what's being proposed here?
- Modular Linting
That's mostly a hack for when the original language design was botched.
View this from the point of view of the maintenance programmer - what guarantees apply to this code? What's been prevented from happening? Rust has one linter, and you can add directives in the code which allow exceptions. This allows future maintenance programmers to see what is being allowed.
> Capabilities ... They're not really compatible with C's flat memory model ... Capabilities mean having multiple types of memory
C is not really dependent on a flat memory model - instead, it models memory allocations as separate "objects" (quite reminiscent of "object orientation in hardware" which is yet another name for capabilities), and a pointer to "object" A cannot be offset to point into some distinct "object" B.
> A Truly Relational Language
This is broadly speaking how PROLOG and other logic-programming languages work. The foundational operation in such languages is a knowledge-base query, and "relations" are the unifying concept as opposed to functions with predefined inputs and outputs.
> In general, while I can’t control how people react to this list, should this end up on, say, Hacker News, I’m looking more for replies of the form “that’s interesting and it makes me think of this other interesting idea” and less “that’s stupid and could never work because X, Y, and Z so everyone stop talking about new ideas” or “why hasn’t jerf heard of this other obscure language that tried that 30 years ago”. (Because, again, of course I don’t know everything that has been tried.)
- Everything except C now has standard strings, not just arrays of characters. Almost all languages now have some standard way to do key/value sets. What else ought to be standard?
-- Arrays of more than one dimension would be helpful for numerical work. Most languages descended from C lack this. They only have arrays of arrays. Even Rust lacks it. Proposals run into bikeshedding - some people want rectangular slices out of arrays, which means carrying stride info around.
-- Standard types for 2, 3 and 4-element vectors would help in graphics work. There are too many different implementations of those in most languages and too much conversion.
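To show what "carrying stride info around" means for rectangular slices, here is a rough Python sketch. `Array2D` and `window` are hypothetical names, not any library's API; the point is only that a slice is a view described by an offset plus strides over shared flat storage.

```python
class Array2D:
    """Row-major 2D view over a flat list, described by offset, shape and strides."""

    def __init__(self, data, rows, cols, offset=0, row_stride=None, col_stride=1):
        self.data = data
        self.rows, self.cols = rows, cols
        self.offset = offset
        self.row_stride = cols if row_stride is None else row_stride
        self.col_stride = col_stride

    def _index(self, r, c):
        if not (0 <= r < self.rows and 0 <= c < self.cols):
            raise IndexError((r, c))
        return self.offset + r * self.row_stride + c * self.col_stride

    def __getitem__(self, rc):
        r, c = rc
        return self.data[self._index(r, c)]

    def __setitem__(self, rc, value):
        r, c = rc
        self.data[self._index(r, c)] = value

    def window(self, r0, c0, rows, cols):
        """Rectangular slice that shares storage with the parent (no copy)."""
        if r0 < 0 or c0 < 0 or r0 + rows > self.rows or c0 + cols > self.cols:
            raise IndexError((r0, c0, rows, cols))
        start = self.offset + r0 * self.row_stride + c0 * self.col_stride
        return Array2D(self.data, rows, cols, start, self.row_stride, self.col_stride)
```

A window of a 3x4 array keeps the parent's row stride of 4, which is exactly the extra state that plain arrays of arrays never need to carry.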
Things to think about:
- Rust's ownership restrictions are harsh. Can we keep the safety and do more?
-- The back-reference problem needs to be solved somehow. Back references can be done with Rc and Weak, but it's clunky.
-- Can what Rust does with Rc, RefCell, and .borrow() be checked at compile time? That allows eliminating the run-time check, and provides assurance that the run-time check won't fail. Something has to look at the entire call tree at compile time, and sometimes it won't be possible to verify this at compile time. But most of the time, it should be.
-- There's a scheme for ownership where there's one owning reference and N using references. The idea is to verify at compile time that the using references cannot outlive the owning one. Then there's no need for reference counts.
-- Can this be extended to the multi-thread case? There have been academic demos of static deadlock detection, but that doesn't seem to have made it into production languages.
-- A common idiom involves things being owned by handles, but also indexed for lookup by various keys. Dropping the handle drops the object and removes it from the indices. Is that a useful general purpose operation? It's one that gets botched rather often.
-- Should compilers have SAT-solver level proof systems built in?
-- Do programs really have to be in monospaced fonts? (Mesa on the Alto used the Bravo word processor as its text editor. Nobody does that any more.)
-- There's async, there are threads, and there are "green threads", such as Go's "goroutines". Where's that going?
-- Can we have programs which run partly in a CPU and partly in a GPU, compiled together with the appropriate consistency checks, so the data structures and calls must match to compile?
-- How about "big objects?" These are separately built program components which have internal state and some protection from their callers. Microsoft OLE did that, some .dll files do that, and Intel used to have rings of protection and call gates to help with that, hardware features nobody used. But languages never directly supported such objects.
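The "owned by handles, but also indexed for lookup" idiom from the list above can be sketched in Python with weak references. `Catalog` and `Item` are hypothetical names, and the prompt cleanup relies on CPython's reference counting; other runtimes may only clear the indices at the next garbage collection.

```python
import weakref

class Item:
    def __init__(self, name, color):
        self.name = name
        self.color = color

class Catalog:
    """Indices hold only weak references; the caller's handle is the sole owner."""

    def __init__(self):
        self.by_name = weakref.WeakValueDictionary()
        self.by_color = {}  # color -> WeakSet of items

    def add(self, name, color):
        item = Item(name, color)
        self.by_name[name] = item
        self.by_color.setdefault(color, weakref.WeakSet()).add(item)
        return item  # the only strong reference leaves with the caller
```

Dropping the returned handle removes the item from both indices without any explicit unregister call. An empty `WeakSet` stays behind for the color, which is the kind of detail that tends to get botched in hand-written versions.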
> Do programs really have to be in monospaced fonts?
Of course not. I've been using a proportional font for at least 10 years and I'm still in business working on code bases shared with developers using monospaced fonts. Both work, neither disturbs the other, and proportional is easier to read, as any book can demonstrate. Alignment doesn't matter much.
He proposes that there is a need for a way to connect modules, i.e. dependency injection, without the modules having explicit knowledge of each other, with compile-time verification that the modules being connected are compatible, without the interface song and dance.
I think Squeak had Monticello for source control with their image-based approach almost 20+ years ago, and there was something else for Smalltalk in the '80s too.
But yeah people like text and hate images, and I believe Pharo switched back to some git integration.
Smalltalk implementations have had text export/import for ages, and image-based source control, as you point out, is also quite old; Monticello wasn't the first.
I'm surprised these are called "programming language ideas". They seem to be solvable, at least many of them, with libraries. For example, my Haskell effect system Bluefin can be seen as a capability system for Haskell. My database library Opaleye is basically a relational query language for Haskell. Maybe I'm short-sighted but I haven't seen the need for a whole new language to support any of that functionality. In fact one gets huge benefits from implementing such things in an existing language.
One advantage (which is touched on in the logging section) is that having it provided by the language makes it clear what the default is, and sets expectations. Essentially, lifting it into the language is a way of coordinating the community.
I also agree that a relational approach to in-memory data is a good, effective idea.
I recently compiled some of my C code together with the SQLite database, and I'm starting to think about how a SQL model of my code's data could be used as the actual implementation language for in-memory operations.
Instead of writing the hundredth loop over objects, I just write a SQL query with joins, treating the software's internal data representation as an information system instead of bespoke code.
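As a rough illustration of "a query instead of the hundredth loop", here is a Python sketch using the standard `sqlite3` module with an in-memory database. The `customer` and `orders` tables and the function name are invented for the example; the comment above is about C, so treat this only as the shape of the idea.

```python
import sqlite3

def orders_per_customer(customers, orders):
    """customers: [(id, name)]; orders: [(id, customer_id, total)]."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    db.executemany("INSERT INTO customer VALUES (?, ?)", customers)
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    # One declarative join replaces the usual nested loops and dict bookkeeping.
    rows = db.execute(
        """
        SELECT c.name, COUNT(o.id), COALESCE(SUM(o.total), 0)
        FROM customer c LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id
        ORDER BY c.name
        """
    ).fetchall()
    db.close()
    return rows
```

The left join keeps customers without orders in the result, a case that hand-written loops often forget.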
I was hoping to make it possible to handle batches of data and add parallelism because arrays are useful when you want to parallelise.
I was thinking, wouldn't it be good if you could write your SQL queries in advance of the software and then parse them and then compile them to C code (using an unrolled loop of the SQLite VM) so they're performant. (For example, instead of a btree for a regular system operation, you can just use a materialised array a bit like a filesystem so you're not rejoining the same data all the time)
I was thinking of ways of representing actors somehow communicating by tables but I do not have anything concrete for that.
DataDraw is an ultra-fast persistent database for high performance programs written in C. It's so fast that many programs keep all their data in a DataDraw database, even while being manipulated in inner loops of compute intensive applications. Unlike slow SQL databases, DataDraw databases are compiled, and directly link into your C programs. DataDraw databases are resident in memory, making data manipulation even faster than if they were stored in native C data structures (really). Further, they can automatically support infinite undo/redo, greatly simplifying many applications.
I agree about relational languages. It's absurd when I think that SQL and Datalog came from the same foundations of relational calculus. It's just so much lost expressive power.
I really like what PRQL [1] did, at least it makes table operations easily chainable. Another one that comes to mind is Datomic [2].
I was struggling with doing interesting things with the semantic web circa 2007 and was thinking "OWL sucks" and looking at Datalog as an alternative. At that time Datalog was an obscure topic and it was hard to find information about it. 10 years later it was big.
(Funny after years of searching I found somebody who taught me how to do really complex modelling in OWL DL but from reading the literature I'm pretty sure the average PhD or prof in the field has no idea.)
I wrote up what I learned in a technical report that got sent to the editors at ISO a month or so ago and ought to appear pretty soon. Look up my profile and send me a note.
You might be interested in looking at the Lima programming language: http://btetrud.com/Lima/Lima-Documentation.html . It has ideas that cover some of these things. For example, it's intended to operate with fully automatic optimization. This assumption allows shedding lots of complexity that arises from needing to do the same logical thing in multiple ways that differ in their physical efficiency characteristics. Like instead of having 1000 different tree classes, you have 1 and optimisers can then look at your code and decide what available tree structures make most sense in each place. Related to your async functions idea, it does provide some convenient ways of handling these things. While functions are just normal functions, it has a very easy way to make a block async (using "thread") and provides means of capturing async errors that result from that.
For semi-dynamic language, Julia definitely took the approach of being a dynamic language that can be (and is) JITed to excellent machine code. I personally have some larger projects that do a lot of staged programming and even runtime compilation of user-provided logic using Julia. Obviously the JIT is slower to complete than running a bit of Lua or whatever, but the speed after that is phenomenal and there’s no overhead when you run the same code a second time. It’s pretty great and I’d love to see more of that ability in other languages!
Some of the other points resonate with me. I think sensible dynamic scoping would be an easy way to do dependency injection. Together with something like linear types you could do capabilities pretty smoothly, I think. No real reason why you couldn’t experiment with some persistent storage as one of these dependencies, either. Together with a good JIT story would make for a good, modular environment.
Oh and Zig is another option for allowing injections that are checked when used at a call site rather than predefined through interfaces.
AFAIK it doesn’t have closures (it’s too C-like) so you need to use methods for all your (implicit) interfaces, but that’s okay…
I think the “exemplars” could be automatically yoinked from documentation and tests and existing usage of the function in the code base. Work needs to be done on the IDE front to make this accessible to the user.
> Smalltalk and another esoteric programming environment I used for a while called Frontier had an idea of a persistent data store environment. Basically, you could set global.x = 1, shut your program down, and start it up again, and it would still be there.
Frontier! I played with that way back when on the Mac. Fun times.
But as for a programming language with an integrated database... MUMPS! Basically a whole language and environment (and, in the beginning, operating system) built around a built-in global database. Any variable name prefixed with ^ is global and persistent, with a sparse multi-dimensional array structure to be able to organize and access the variables (e.g. ^PEOPLE(45,"firstname") could be "Matthew" for the first name of person ID 45). Lives on today in a commercial implementation from InterSystems, and a couple Free Software implementations (Reference Standard M, GT.M, and the GT.M fork YottaDB). The seamless global storage is really nice, but the language itself is truly awful.
TADS, an OOP language + VM for interactive fiction, has this "value database" model. Once loaded into memory, the compiled image can be updated with values stored in a separate save file. The compiled image itself could store updated values as well.
In fact, it does this during a "preinit" stage that runs immediately after compilation. Once all preinit code finishes executing, the compiled image is overwritten with the updated state. The language includes a "transient" keyword to permit creating objects that should not be stored.
This same mechanism permits in-memory snapshots, which are used for the game's UNDO feature. No need to rewind or memento-ize operations, just return to a previous state.
It's not a general-purpose mechanism. After all, the language is for building games with multiple player-chosen save files, and to permit restarting the game from a known Turn 0 state.
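The snapshot-based UNDO described above can be approximated in a few lines of Python. This is not how TADS implements it; `World`, `turn`, and `undo` are made-up names, and a deep copy per turn stands in for whatever the VM actually does.

```python
import copy

class World:
    """Toy game state with snapshot-based undo."""

    def __init__(self):
        self.state = {"room": "cellar", "inventory": []}
        self._snapshots = []

    def turn(self, action):
        # Save the whole state before the turn; no per-operation undo logic.
        self._snapshots.append(copy.deepcopy(self.state))
        action(self.state)

    def undo(self):
        """Return to the state before the last turn; False if nothing to undo."""
        if not self._snapshots:
            return False
        self.state = self._snapshots.pop()
        return True
```

The action code never has to know how to reverse itself, which is the appeal: undo is a property of the store, not of each operation.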
Image persistence was one of the cool ideas of Smalltalk. And in practice, one of the biggest drawbacks. Cruft and old values accumulated steadily, with very little way to find and eliminate them. Transient execution has some cons. But on the pro side, every run starts from a "clean slate."
I believe it's just a git repo behind the scenes. Not sure if the UI exposes those things as I never used that in multi-developer scenarios!
Give it a go and see.
This may fall in the "you think you do, but you don't" category, but I've always wanted a Smalltalk (or similar, not that picky) with a persistent virtual memory.
That is, the VM is mapped to a backing file, changes persisted automatically, no "saving", limited by drive space (which, nowadays, is a lot). But nowadays we also have vast memory space to act as a page cache and working memory.
My contrived fantasy use case was having a simple array named "mail", which is an array containing all of my email messages (as email objects, of course). Naturally as you get more mail, the array gets longer. Also, as you delete mail, the array shifts. It's no different, roughly, than the classic mbox format, save it's not just text, it's objects.
You can see that if you delete an email from a large (several GBs) array, there would be a lot of churn. That implies maybe it's not a great idea to use that data structure, but that's not the point. You CAN use that data structure if you like (just like you can use mbox if you like).
Were it to be indexed, that would be done with parallel data structures (trees or hashes or whatever).
But this is all done automagically. Just tweaks to pages in working memory backed by the disk using the virtual memory manager. Lots and lots of potential swapping. C'est la vie, no different from anything else. This is what happens when you map 4TB into a 16GB work space.
The problem with such a system is how fragile it potentially is. Corrupt something and it happily persists that corruption, wrecking the system. You can't reboot to fix it.
Smalltalk suffers from that today. Corrupt the image (oops, did I delete the Object become: method again?), and it's gone for good. This is mitigated by having backup images, and the changelist to try to bring you back to the brink but no further.
I'm guessing a way to do that in this system is to use a copy on write facility. Essentially, snapshot the persistent store on each boot (or whatever), and present a list of previous snapshots at start up.
Given the structure of a ST VM you'd like to think this is not that dreadful to work up. I'd like to think a paper napkin implementation PoC would be possible, just to see what it's like. One of those things where the performance isn't really that great, but modern systems are so fast, we don't really notice it in human terms.
Have you looked at Pharo? Their git integration makes it relatively easy to export and backup parts of your main image, and to pull the things back into a fresher one once you mess up.
The MUMPS database is wild. When I was working in MUMPS, it was so easy and fun to whip up an internal tool to share with my coworkers. You don't have to give any special thought at all to persistence, so you're able to stay in the flow of thinking about your business logic.
But as you said, the language itself is almost unbearable to use.
To me, this idea seems so, so insane (especially for things like extraction: you start extracting a zip on one device, it can be partially extracted, and then you can continue extracting it on the other device). (Yes, sure, you could loop over each file, keep a list of the files already unzipped, and just unzip the files that haven't been unzipped yet.)
But imagine if the file to be extracted is a single file in the zip (like a 100 GB file).
I don't know; I have played with this using CRIU and it worked. QEMU can also work. But this idea is cool.
Instead of using a default storage where entropy can hit, I would personally like it if the values were actually stored in SQLite, maybe combined with the Truly Relational Language as well (but without truly requiring you to learn SQLite).
I had posted about this on Hacker News as well, and theoretically it's possible with the brainfu*-in-SQLite interpreter that I had found. But I don't know... If anybody knows of a new language or a method for integrating this in new languages, it would be pretty nice.
Oh my god, another banger is the modular monolith part. I personally believe the Java/Kotlin ecosystem, Golang with NATS, and Elixir/Erlang can be considered examples of it.
Another cool way is using Encore in Golang or TypeScript and then hosting the AWS stack yourself or running Encore locally (I am not sure).
I think the coloured function problem boils down to the fact that async functions are not naturally a specific kind of sync function, but the other way around.
Functions are so ubiquitous we forget what they really are: a type of guarantee about the conditions under which the code within will run. Those guarantees include the availability of arguments and a place to put the return value (on the stack).
One of the key guarantees about sync functions is the call structure: one thread of execution will be in one function and one function only at any point during the program; the function will only be exited on return (or exception, or panic) or call of another function; and all the local data will be available only for the duration of that function call.
From that perspective, async functions are a _weakening_ of the procedural paradigm where it is possible to "leave behind" an instruction pointer and stack frame to be picked up again later. The ability to suspend execution isn't an additional feature, it's a missing guarantee: a generalisation.
There is always an interplay between expressiveness and guarantees in programming languages. Sometimes, it is worth removing a guarantee to create greater expressiveness. This is just an example of that.
I mentioned exceptions earlier — it's no wonder that exceptions and async both get naturally modelled in the same way (be it with monads or algebraic effects or whatever). They are both examples of weakening of procedural guarantees. Exceptions weaken the guarantee that control flow won't exit a function until it returns.
I think the practical ramifications of this are that languages that want async should be thinking about synchronous functions as a special case of suspendable functions — specifically the ones that don't suspend.
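One way to see "sync as the special case of suspendable" is with Python generators. In this sketch (all names are mine, and generators are only a stand-in for a real async runtime) every function is suspendable, and an ordinary function is simply one that never suspends.

```python
def run(task):
    """Drive a suspendable function to completion.

    Returns (result, number_of_suspensions).
    """
    suspensions = 0
    try:
        while True:
            next(task)  # resume until the next suspension point
            suspensions += 1
    except StopIteration as done:
        return done.value, suspensions

def add(a, b):
    """A 'sync' function: suspendable in type, but it never suspends."""
    return a + b
    yield  # unreachable; only marks this as a generator function

def fetch_then_add(a, b):
    """An 'async' function: leaves its frame behind twice before returning."""
    yield "waiting for a"
    yield "waiting for b"
    return a + b
```

Both go through the same `run` driver, so there is only one colour of function here; the synchronous one is just the case with zero suspensions, which a compiler could detect and call directly.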
As a counterpoint, I can imagine a lot of implementation complexities. Hardware is geared towards the classical procedural paradigm, which provides an implementation foundation for synchronous procedures. The lack of that for async can partially explain why language authors often don't provide a single async runtime, but have this filled in by libraries (I'm thinking of Rust and Kotlin here).
Interesting that E is cited under “capabilities”, but not under “loosen up the functions”. E’s eventual-send RPC model is interesting in a number of ways. If the receiver is local then it works a bit like a JavaScript callback in that there’s an event loop driving execution; if it’s remote then E has a clever “promise pipelining” mechanism that can hide latency. However E didn’t do anything memorable (to me at least!) about handling failure, which was the main point of that heading.
For “capabilities” and “A Language To Encourage Modular Monoliths”, I like the idea of a capability-secure module system. Something like ML’s signatures and functors, but modules can’t import, they only get access to the arguments passed into a functor. Everything is dependency injection. The build system determines which modules are compiled with which dependencies (which functors are passed which arguments).
An existing “semi-dynamic language” is CLOS, the Common Lisp object system. Its metaobject protocol is designed so that there are clear points when defining or altering parts of the object system (classes, methods, etc.) at which the result is compiled, so you know when you pay for being dynamic. It’s an interesting pre-Self design that doesn’t rely on JITs.
WRT “value database”, a friend of mine used to work for a company that had a Lisp-ish image-based geospatial language. They were trying to modernise its foundations by porting to the JVM. He had horror stories about their language’s golden image having primitives whose implementation didn’t correspond to the source, because of decades of mutate-in-place development.
The most common example of the “value database” or image-based style of development is in fact your bog standard SQL database: DDL and stored procedures are very much mutate-in-place development. We avoid the downsides by carefully managing migrations, and most people prefer not to put lots of cleverness into the database. The impedance mismatch between database development by mutate-in-place and non-database development by rebuild and restart is a horribly longstanding problem.
As for “a truly relational language”, at least part of what they want is R style data frames.
My wild idea is that I'd like to see a modern "high-level assembler" language that doesn't have a callstack. Just like in the olden days, all functions statically allocate enough space for their locals. Then, combine this with some semi-convenient facility for making sure that local variables for a given function always fit into registers; yes, I admit that I'm strange when I say that I dream of a language that forces me to do manual register allocation. :P But mostly what I want to explore is if it's possible to create a ""modern"" structured programming language that maps cleanly to assembly, and that provides no optimization backend at all, but has enough mechanical sympathy that it still winds up fast enough to be usable.
> all functions statically allocate enough space for their locals.
Would you still have distinct activation records per call or forfeit the ability to have reentrant functions and recursion?
That's one of the main reasons to move to dynamic (as in a call stack) allocation of your activation records versus a single static allocation per function.
In this hypothetical language I'm assuming that recursion is unsupported and that if threading is supported at all, then each thread has its own copy of every function's locals (or at least every function that can be called concurrently; structured concurrency might be leveraged to prove that some functions don't need to be reentrant, or maybe you just chuck a mutex in each function prologue and YOLO). However, while enforcing that basic recursion is forbidden isn't too difficult (you make the language statically-typed, all names lexically-scoped, and don't support forward declarations), it does probably(?) mean that you also lose first-class functions and function pointers, although I haven't thought deeply about that.
Have you thought about what happens if you want to read and parse a file? Do you declare the maximum filesize you want to support and statically allocate that much memory?
I'm not intending to imply that the language I'm describing can't support heap-allocated memory; Rust shows us that it's even possible to do so without having to manually deallocate, if you're okay with a single-ownership discipline (which is a rather simple analysis to implement, as long as you don't also want a borrow checker along for the ride). Instead, this is about trying to make a language that makes it easy to keep locals in registers/cache, rather than relying on the compiler backend to do register allocation and hoping that your CPU can handle all that cache you're thrashing.
No, you have a scoped pointer to dynamically allocated memory; when the scoped pointer is destroyed/cleaned up/released at the end of the function, it releases the allocated memory.
A useful purpose for such a thing is in certain embedded, hard-real-time, or mission-critical scenarios.
Many such programming environments need strict control over stack sizes to avoid any possibility of stack overflow.
I had a similar notion a few years back, thinking about a somewhat wider range of "scoped guarantees". The compiler would compute things such as the maximum stack usage of a function, and this would "roll up" to call sites automatically. This could also be used to enforce non-usage of certain dangerous features such as locks, global flags, or whatever.
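The "roll up to call sites" computation is easy to sketch once recursion is banned. Below is a Python version under my own assumptions: frame sizes and the call graph are given as plain dicts, and `max_stack_usage` is a hypothetical name, not any compiler's actual pass.

```python
def max_stack_usage(frame_size, calls):
    """Worst-case stack bytes per function.

    frame_size: {function: bytes of locals}
    calls:      {function: [callees]}
    Raises ValueError if the call graph contains recursion.
    """
    result = {}
    in_progress = set()

    def visit(fn):
        if fn in result:
            return result[fn]
        if fn in in_progress:
            raise ValueError(f"recursion through {fn!r}: stack usage is unbounded")
        in_progress.add(fn)
        # Only one callee is active at a time, so take the deepest one.
        deepest_callee = max((visit(c) for c in calls.get(fn, [])), default=0)
        in_progress.discard(fn)
        result[fn] = frame_size[fn] + deepest_callee
        return result[fn]

    for fn in frame_size:
        visit(fn)
    return result
```

The same traversal could roll up other properties, for example "may take a lock", by replacing the sum-of-max with a boolean OR over callees.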
Why would you like to have this language? Is it about control over the execution? About better ways to personally optimize? Or just intellectual pleasure? Or is it about reliving the olden days of assembly language programming but with modern conveniences?
I would simply find pleasure in being able to understand basically every level of the stack. For a RISC architecture, it's not too hard to get a grasp on how it works. Likewise for a simple-enough programming language. The problem(?) is that in between these two is an opaque black box--the optimization backend, which I feel I have no hope of understanding. So instead I wonder if it's possible to have a "safe" (safer than C) and "high-level" (more abstractive than C) language that is still useful and semi-performant, and I'm wondering how much ergonomics would need to be sacrificed to get there. It's a thought experiment.
We have built something that hits on points 1, 3, 5, and 7 at https://reboot.dev/ ... but in a multi-language framework (supporting Python and TypeScript to start).
The end result is something that looks a lot like distributed, persistent, transactional memory. Rather than explicit interactions with a database, local variable writes to your state are transactionally persisted if a method call succeeds, even across process/machine boundaries. And that benefits point 7, because transactional method calls compose across team/application boundaries.
[1] Loosen Up The Functions
[3] Production-Level Releases
[5] Value Database
[7] A Language To Encourage Modular Monoliths
They are related, for sure. But one of the biggest differences is that operations affecting multiple Reboot states are transactional, unlike Azure's "entity functions".
Because multiple Azure entity functions are not updated transactionally, you are essentially always implementing the saga pattern: you have to worry about cleaning up after yourself in case of failure.
In Reboot, transactional function calls automatically roll back all state changes if they fail, without any extra boilerplate code. Our hypothesis is that that enables a large portion of an application to skip worrying about failure entirely.
Code that has side-effects impacting the outside world can be isolated using our workflow mechanism (effectively durable execution), which can themselves be encapsulated inside of libraries and composed. But we don't think that that is the default mode that developers should be operating in.
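To illustrate the general "roll back all state changes on failure" behaviour, here is a deliberately tiny Python sketch. It is not Reboot's API and has none of its persistence or cross-process guarantees; `transaction` is a hypothetical in-process helper that works on plain dicts.

```python
import copy
from contextlib import contextmanager

@contextmanager
def transaction(*states):
    """Restore every dict in `states` to its starting contents if the block raises."""
    backups = [copy.deepcopy(s) for s in states]
    try:
        yield
    except BaseException:
        for state, backup in zip(states, backups):
            state.clear()
            state.update(backup)
        raise
```

Inside `with transaction(alice, bob):` the code can update both dicts freely and simply raise on failure; there is no compensating "undo the first write" step of the kind the saga pattern requires.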
Starlark, a variant of Python, can be thought of as semi-dynamic: all mutation in each file happens once, single-threaded, and then that file and all its data structures are frozen so downstream files can use it in parallel.
A lot of "staged" programs can be thought of as semi-dynamic as well, even things like C++ template expansion or Zig comptime: run some logic up front, freeze it, then run the rest of the application later.
An interesting problem I've played around with a fair bit is the idea of a maximally expressible non-Turing-complete language, trying to make a language that is at least somewhat comfortable to use for many tasks, while still being able to make static assertions about runtime behavior.
The best I've managed is a functional language that allows for map, filter, and reduce, but forbids recursion or any other looping or infinite expansion in usercode.
The pitch is that this kind of language could be useful in contexts where you're executing arbitrary code provided by a potentially malicious third party.
Non-Turing-completeness doesn’t buy you that much, because you can still easily multiply runtime such that it wouldn’t terminate within your lifetime. With just map you can effectively build the cross product of a list with itself. Do that in an n-times nested expression (or nested, non-recursive function calls), and for a list of length k the result is a list of length kⁿ. And with reduce you could then concatenate a string with itself those kⁿ times, resulting in a string (and likely runtime and memory usage) of length 2^kⁿ.
If you want to limit the runtime, you need to apply a timeout.
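The blow-up is easy to reproduce. In this Python sketch the `for` loop in `nested_maps` is only a stand-in for writing an n-times nested map expression out by hand, and both function names are made up; the user-level operations are just map-style comprehensions and a reduce.

```python
from functools import reduce

def nested_maps(items, n):
    """Simulate n nested map expressions over `items`; yields len(items)**n rows."""
    level = [()]
    for _ in range(n):
        # One more level of "map over items inside a map over the previous result".
        level = [row + (x,) for row in level for x in items]
    return level

def doubled(s, times):
    """Reduce that concatenates the accumulator with itself once per element."""
    return reduce(lambda acc, _: acc + acc, range(times), s)
```

With k = 10 and n = 3 the first step already produces 1000 rows, and doubling a one-character string 1000 times would need 2^1000 characters, so every program terminates in principle but not in practice.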
I think you're asking for Starlark (https://starlark-lang.org), a language that strongly resembles Python but isn't Turing-complete, originally designed at Google for use in their build system. There's also Dhall (https://dhall-lang.org), which targets configuration use cases; I'm less familiar with it.
One problem is that, while non-Turing-completeness can be helpful for maintainability, it's not really sufficient for security. Starlark programs can still consume exponential amounts of time and memory, so if you run an adversary's Starlark program without sandboxing it, you're just as vulnerable to denial-of-service attacks as you'd be with a Turing-complete language. The most common solution is sandboxing, wherein you terminate the program if it exceeds time or memory limits; however, once you have that, it's no longer necessary for the language to not be Turing-complete, so you might as well use a popular mainstream language that's easy to sandbox, like JavaScript.
One other intriguing option in the space is CEL (https://cel.dev), also designed at Google. This targets use cases like policy engines where programs are typically small, but need to be evaluated frequently in contexts where performance matters. CEL goes beyond non-Turing-completeness, and makes it possible to statically verify that a program's time and space complexity are within certain bounds. This, combined with the lack of I/O facilities, makes it safe to run an adversary's CEL program outside a sandbox.
If you're interested in prior art, Ian Currie's NewSpeak was an attempt at a non-Turing complete language for safety critical systems. Most of the search results are for a different language with the same name, but "RSRE currie newspeak" should find relevant links.
Well OP, are you me? Everything you listed is also on my short wishlist for a programming language (well, except for the value database: once you have first-class relational tables in your language, persistence can be tied to the table identity and doesn't need to be implicit).
Capabilities and dynamic scoping for "modularisation" nicely lead to implicit variables instead of truly global dynamically scoped variables. Implicit variables also probably work well to implement effect systems which means well behaved asyncs.
Edit: other features I want:
- easy embedding in other low level languages (c++ specifically)
- conversely, easy to embed functions written in another language (again c++).
- powerful, shell-like, process control system (including process trees and pipelines), including across machines.
- built-in cooperative shared memory concurrency, and preemptive shared nothing.
The issue with the while is that more often than not you need to do some preparations before the condition. So you need to move that to a function, or duplicate it before and inside the loop. Do-while doesn't help, since with that you can't do anything after the condition.
The alternative is a while(true) with a condition in the middle.
while(true){
prepare;
if(!check) break;
process
}
But what if there was a language construct for this? Something like
do{prepare}while(condition){process}
Is there a language that implements this somehow? (I'm sure there is, but I don't know of one.)
The best thing is that this construct can be optimized in assembly perfectly:
...
jump-always > start
after:
process
start:
prepare
condition
branch-if-true > after
...
(prog1 foo bar*)
Evaluates foo, then bar(s), and returns the result of evaluating foo and discards the results of bar(s).
Useful if `foo` is the condition and you need to perform some change to it immediately after, eg:
(while (prog1 (< next prev) (setq prev next)) ...)
---
(prog2 foo bar baz*)
Evaluates foo, then bar, then baz(s) (if present), returns the result of evaluating bar and discards the results of evaluating foo and baz(s).
Might be what GP wants. `foo` is the preparation, `bar` is the condition, and `baz` can be some post-condition mutation on the compared value. Not too dissimilar to
for (pre; cond; post) {}
With `prog2` you could achieve similar behavior with no built in `for`:
(while (prog2 pre cond post) ...)
---
(progn foo*)
Evaluate each foo in order, return the result of evaluating the last element of foo and discard all the others.
`progn` is similar to repeated uses of the comma operator in C, which GP has possibly overlooked as one solution.
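For readers who don't use Lisp, the three forms can be imitated in Python, which evaluates call arguments left to right. This is only a sketch of the return-value rules: the function names mirror the Lisp forms, and `note` and `log` are hypothetical helpers used to make the evaluation order visible.

```python
def prog1(first, *rest):
    """Return the first value; later arguments were evaluated only for effect."""
    return first


def prog2(first, second, *rest):
    """Return the second value; the first and any later ones are discarded."""
    return second


def progn(*forms):
    """Return the last value, or None when called with no arguments."""
    return forms[-1] if forms else None


log = []


def note(tag, value):
    """Record that `tag` was evaluated, then pass `value` through."""
    log.append(tag)
    return value


first = prog1(note("foo", 1), note("bar", 2))
second = prog2(note("pre", 0), note("cond", True), note("post", 9))
last = progn(note("a", "x"), note("b", "y"))
```

Unlike the Lisp special forms, these are ordinary functions, so every argument is always evaluated before the call; that happens to match the semantics described above.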
In some way it's the dual of break, in that you want to jump into the middle of the loop, while break is to jump out of it.
Let's rewrite the loop this way, with 'break' expanded to 'goto':
while (true) {
prepare...
if (!cond) goto exitpoint;
process...
}
exitpoint:
The dual would be:
goto entrypoint;
do {
process...
entrypoint:
prepare...
} while(cond);
Both constructs need two points: where the jump begins and where it lands. The 'break' is syntactic sugar that removes the need to specify the label 'exitpoint'. In fact with 'break' the starting point is explicit, it's where the 'break' is, and the landing point is implicit, after the closing '}'.
If we want to add the same kind of syntactic sugar for the jump-in case, the landing point must be explicit (no way for the compiler to guess it), so the only one we can make implicit is the starting point, that is where the 'do' is.
So we need: a new statement, let's call it 'entry', that is the dual of 'break' and a new semantic of 'do' to not start the loop at the opening '{' but at 'entry'.
do {
process...
entry;
prepare...
} while (cond);
Is it more readable than today's syntax? I don't know...
do {
Get(Current_Character);
if (Current_Character == '*') break;
print(Current_Character);
} while (true);
I don't see why this needs a new construct in languages that don't already have it. It's just syntactic sugar that doesn't actually save any work. The one with the specialized construct isn't really any shorter and looks pretty much the same. Both have exactly one line in the middle denoting the split. And both lines look really similar anyway.
C-style for-loop is kinda sorta this. Although the "prepare" part has to be an expression rather than a statement, given that you have the comma operator and ?: you can do a lot there even in C. In C++, you can always stick a statement in expression context by using a lambda. So:
for ([]{
/*prepare*/
}(); /*condition*/;) {
/*body*/
}
However, the most interesting take on loops that I've seen is in Sather, where they are implemented on top of what are, essentially, coroutines, with some special facilities that make it possible to exactly replicate the semantics of the usual `while`, `break` etc in this way: https://www.gnu.org/software/sather/docs-1.2/tutorial/iterat...
Well, I am in the process of making a language where general loops will look like
loop
prepare;
while check;
process;
end;
I also think you'd enjoy Knuth's article "Structured Programming with go to Statements" [0]. It's the article that gave us the "premature optimization is the root of all evil" quote, but that's probably the least interesting part of it. Go read it; it has several sections that discuss looping constructs and possible ways to express them.
do {
let value = prepare();
} while (value.is_valid) {
process(value);
}
Can the second block of the do-while see `value` in its lexical scope? If yes, you have this weird double brace scope thing. And if no, most non-trivial uses will be forced to fall back to `if (...) break;` anyway, and that's already clear enough imo.
The scope should be unique, yes. In your example value should be visible.
You are right about the weird double braces, but I can't think of an alternate syntax other than just removing the braces around the while. But in that case it may seem odd to have a keyword that can only be used inside a specific block... which is basically a macro for an if(.)break;
Maybe I'm too used to the c/java syntax, maybe with a different way of defining blocks?
That seems more like a programmer expectations issue than something fundamental. Essentially, you have "do (call some function that returns a chunk of state) while (predicate that evaluates the state) ..."
Hard to express without primitives to indicate that, maybe.
You mean like a shell's while-do-done? It's just about allowing statements as the conditions, rather than just a single expression. Here's an example from a repl I wrote:
repl_prompt="${repl_prompt-repl$ }"
while
printf "%s" "$repl_prompt"
read -r line
do
eval "$line"
done
echo
The `printf` is your `prepare`.
This should also be doable in languages where statements are expressions, like Ruby, Lisp, etc.
Here's a similar Ruby repl:
while (
print "repl> "
line = gets
) do
result = eval line
puts result.inspect
end
puts
Exactly, here you are basically keeping it as a while with a condition but allowing it to be any code that at the end returns a boolean, although you need to make sure that variables defined in that block can be used in the do part.
Sidenote: I wasn't aware that shell allows for multiple lines, good to know!
I don't write a lot of while loops so this is just a bit unfamiliar to me, but I'm not really understanding how this isn't the same as `do{block}while(condition);`? Could you give a simple example of what kind of work `prepare` is doing?
Think of a producer (a method that returns data each time you request one, like reading a file line by line or extracting the top of a queue for example) that you need to parse and process until you find a special element that means "stop".
I'm aware this example can be trivially replaced with a while(data=parse(producer.get())){process(data)} but you are forced to have a method, and if you need both the raw and parsed data at the same time, either you mix them into a wrapper or you need to somehow transfer two variables at the same time from the parse>condition>process
A do-while here also has the same issue, but in this case after you check the condition you can't do any processing afterwards (unless you move the check and process into a single check_and_process method...which you can totally do but again the idea is to not require it)
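The producer scenario can be sketched in Python using the `while True` / `break` form discussed earlier in the thread. `parse`, `run`, and the `"STOP"` sentinel are hypothetical stand-ins; the point is that the process step sees both the raw and the parsed value without a wrapper type.

```python
def parse(raw):
    """Hypothetical parser: trims whitespace and upper-cases the input."""
    return raw.strip().upper()


def run(producer, stop="STOP"):
    """Consume items until the parsed value equals the stop marker."""
    processed = []
    while True:
        # prepare: fetch and parse the next item
        raw = next(producer, None)
        if raw is None:
            break  # producer exhausted
        parsed = parse(raw)
        # check: the condition sits in the middle of the loop
        if parsed == stop:
            break
        # process: both raw and parsed are in scope here
        processed.append((raw, parsed))
    return processed


print(run(iter([" a ", "b", "stop", "c"])))  # [(' a ', 'A'), ('b', 'B')]
```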
Yeah, `while...else` in Python does the wrong thing. Executes `else` block when the loop finished normally (not through `break`).
Scala for example has a `breakable {}` block that lets you indicate where you should land after a `break`
import scala.util.control.Breaks._

breakable {
  while (condition) {
    // loop body
    if (otherCondition) break()
    // rest of the body
  }
  // body of pythonic else block
} // you land here if you break
However I have no idea how to implement the kind of `else` I described in any language without checking the condition twice.
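For reference, this is what Python's existing `while...else` does: the `else` block runs when the loop condition becomes false, including when the body never ran at all, and is skipped only on `break`. The `search` function is a hypothetical example.

```python
def search(items, target):
    """Return 'found' if target is in items, else 'exhausted'."""
    i = 0
    while i < len(items):
        if items[i] == target:
            outcome = "found"
            break  # skips the else block
        i += 1
    else:
        # Runs when the condition became false, even on the very first check.
        outcome = "exhausted"
    return outcome


print(search([1, 2, 3], 2))  # found
print(search([1, 2, 3], 9))  # exhausted
print(search([], 1))         # exhausted
```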
PowerShell can process 0..n input objects from the pipeline using BEGIN {...} PROCESS {...} END {...} blocks.
I find this so incredibly useful, that I miss it from other languages.
Something related that I've noticed with OO languages such as Java is that they tend to result in "ceremony" getting repeated n times for processing n objects. A well-designed begin-process-end syntax for function calls over iterables would be amazing. This could apply to DB connection creation, security access checks, logging, etc...
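A rough Python approximation of the begin-process-end idea, as a plain function rather than language syntax. `run_stages` and the three callbacks are hypothetical names, not an imitation of PowerShell's actual semantics: setup and teardown run once, and only `process` runs per item.

```python
def run_stages(items, begin, process, end):
    """Run `begin` once, `process` once per item, then `end` once."""
    state = begin()              # one-time setup (connection, access check, ...)
    for item in items:
        process(state, item)     # per-item work shares the prepared state
    return end(state)            # one-time teardown / summary


def begin():
    return {"count": 0, "total": 0}


def process(state, item):
    state["count"] += 1
    state["total"] += item


def end(state):
    return f"{state['count']} items, total {state['total']}"


summary = run_stages([3, 4, 5], begin, process, end)
empty = run_stages([], begin, process, end)
print(summary)  # 3 items, total 12
print(empty)    # 0 items, total 0
```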
This runs all 3 in order every iteration but quits if condition evaluates to false. It just uses the fact that value of a block is the value of the last expression in the block.
Scala has a lot of syntax goodies although some stuff is exotic. For example to have a 'break' you need to import it and indicate where from exactly you want to break out of.
> What about a language where for any given bit of code, the dynamicness is only a phase of compilation?
This is (essentially) Crystal lang's type system. You end up with semantic analysis/compilation taking a significant amount of time, longer than other comparable languages, and using a lot of resources to do so.
"Semi-dynamic" is one of the most common architectures there is for large & complex systems. AAA games are usually written in a combination of C++ and a scripting language. GNU Emacs is a Lisp application with a custom interpreter that is optimized for writing a text editor. Python + C is a popular choice as well as Java + Groovy or Clojure, I've even worked with a Lua + FORTRAN system.
I also think "parsers suck". It should be a few hundred lines at most, including the POM file, to add an "unless" statement to the Java compiler. You need to (1) generate a grammar which references the base grammar and adds a single production, (2) create a class in the AST that represents the "unless" statement and (3) add a transformation that rewrites
unless(X) {...} -> if(!X) {...}
You should be able to mash up a SQL grammar and the Java grammar so you can write
var statement = <<<SELECT * FROM page where id=:pageId>>>;
This system should be able to export a grammar to your IDE. Most parser generators are terribly unergonomic (cue the event-driven interface of yacc) and not accessible to people who don't have a CS education (if you need a bunch of classes to represent your AST, shouldn't these get generated from your grammar?). When you generate a parser you should get an unparser. Concrete syntax trees are an obscure data structure but were used in obscure RAD tools in the 1990s that would let you modify code visually and make the kind of patch that a professional programmer would write.
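Step (3), the `unless` to `if(!X)` rewrite, is small once the AST is made of named classes. Here is a Python sketch over a toy AST; the `Not`, `If`, and `Unless` node classes and `desugar` are hypothetical and have nothing to do with javac's real tree API.

```python
from dataclasses import dataclass


@dataclass
class Not:
    operand: object


@dataclass
class If:
    condition: object
    body: list


@dataclass
class Unless:
    condition: object
    body: list


def desugar(node):
    """Recursively rewrite every Unless(X, body) into If(Not(X), body)."""
    if isinstance(node, Unless):
        return If(Not(desugar(node.condition)), [desugar(s) for s in node.body])
    if isinstance(node, If):
        return If(desugar(node.condition), [desugar(s) for s in node.body])
    if isinstance(node, Not):
        return Not(desugar(node.operand))
    return node  # leaves (plain strings here) pass through unchanged


tree = Unless("x", [Unless("y", ["stmt"])])
print(desugar(tree))
```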
The counter to this you hear is that compile time is paramount, and there's a great case for that in large code bases (I had a system with a 40-minute build). Yet there's also a case that people do a lot of scripty programming, and trading compile time for ergonomics can be a win (see Perl and REBOL).
I think one goal in programming languages is to bury Lisp the way Mark Antony buried Caesar. Metaprogramming would be a lot more mainstream if it was combined with Chomsky-based grammars, supported static typing, worked with your IDE and all that. Graham's On Lisp is a brilliant book (read it!) that left me disappointed in the end because he avoids anything involving deep tree transformations or compiler theory: people do much more advanced transformations to Java bytecodes. It might be easier to write those kinds of transformations if you had an AST comprised of Java objects instead of the anarchy of nameless tuples.
I love these ideas! I've been thinking about the "fully relational" language ever since I worked with some product folks and marketers at my startup 15 years ago who "couldn't code" but were wizards at cooking up SQL queries to answer questions about what was going on with our users and product. There was a language written in Rust, Tablam[0], that I followed for a while, which seemed to espouse those ideas, but it seems like it's not being worked on anymore. And Jamie from Scattered Thoughts[1] has posted some interesting articles in that direction as well. He used to work on the old YC-company/product LightTable or Eve or something, which was in the same space.
I've also always thought Joe Armstrong's (RIP) thought of "why do we need modules" is really interesting, too. There's a language I've seen posted on HN here a couple times that seems to go in that approach, with functions named by their normalized hash contents, and referred to anywhere by that, but I can't seem to remember what it's called right now. Something like "Universe" I think?
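The "functions named by their normalized hash contents" idea can be illustrated in a few lines of Python. This is only a toy: `content_hash` is a hypothetical helper that normalizes away the function's own name, formatting, and comments, but not local variable names or recursive self-references, which a real system would have to handle.

```python
import ast
import hashlib


def content_hash(source):
    """Hash a single function definition by its structure, not its name."""
    func = ast.parse(source).body[0]
    func.name = "_"  # erase the name so only the contents matter
    # ast.dump omits line/column info by default, so layout and comments vanish.
    return hashlib.sha256(ast.dump(func).encode()).hexdigest()


same_a = content_hash("def f(x):\n    return x + 1\n")
same_b = content_hash("def g(x):  # renamed, reformatted\n    return x+1\n")
other = content_hash("def f(x):\n    return x + 2\n")
print(same_a == same_b, same_a == other)  # True False
```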
> with functions named by their normalized hash contents, and referred to anywhere by that, but I can't seem to remember what it's called right now. Something like "Universe" I think?
I think the problem with "big" language ideas is, that as long as they match exactly your needs, they're great, but if they're slightly off, they can be a pain in the ass.
I'm wondering if languages could provide some kind of meta information, hooks or extension points, which could be used to implement big ideas on top. These big ideas could then be reused and modified depending on the needs of the project.
In which Jerf longs for PHP.
Every single point has been in, and actively used, for a long while.
The __call() & friends is particularly nifty - simple mental model, broad applicability, in practice used sparingly to great effect.
The section about language support for modular monoliths reminds me of John Lakos's "Large-Scale C++ Software Design", which focuses on the physical design/layout of large C++ projects to enforce interfaces and reduce coupling and compilation time. Example recommendations include defining architecture layers using subdirectories and the PImpl idiom. It's pretty dated (1996, so pre-C++98), but still a unique perspective on an overlooked topic.
The only thing I want added to every programming language I use is the ability to call functions and handle data structures provided by libraries and services written in other languages without me having to write arcane wrappers.
It's not very convincing to me when the article talks about truly relational language but fails to mention Prolog and anything that we learned from it.
Logic languages are definitely not what I'd expect a relational-first language to look like.
What we learned from Prolog is mostly that starting from an exponentially-complex primitive and then trying to beat it into submission doesn't work at scale. Relational DBs don't have that problem. They do go n-squared and n-cubed and so forth easily, but there are lots of solutions to that as well.
I'm not sure what you mean with "an exponentially-complex primitive". In my opinion, Prolog lets you start with simple relations (n-squared, using your terms) and then enables you to build more complex relations using them.
Thanks - this was one of the more interesting things I've read here in a while.
I wonder if "Programming languages seem to have somewhat stagnated to me.", a sentiment I share, is just me paying less attention to them or a real thing.
I think there is innovation, but there's more than innovation required to be a good language. If an innovative feature is the cornerstone of a language, it frequently means that the language neglects pragmatic coding features that, while not particularly special, contribute to the language being nice to use.
I feel like the next few years in languages will bring things like Rust descendants, where people with experience in Rust want to keep what works for them but scale back some of the rigidity in favour of pragmatism.
It's also worth noting that there are existing languages that are also changing over time. Free Pascal has developed a lot of features over the years that make it fairly distant from original Pascal. More recent languages like Haxe are still developing into their final form. TypeScript has gone from a language that provided a tangible solution to an existing problem to a quagmire of features that I'd rather not have.
For "Semi-Dynamic Language" it might be worth looking into RPython: interpreters written in RPython have two phases. In the first phase one has full Python semantics, but in the second phase everything is assumed to be less dynamic and more restricted (the R of RPython?), so the residual interpreter is then transpiled to C sources, which, although compiled, can also make use of the built-in GC and JIT.
I immediately thought of Julia as a semi-dynamic language. Julia is a dynamic language, but (as I understand it) the first time a function is called with a specific type signature, that specific method is JIT compiled as static LLVM.
Which is then used for future dispatches on that same signature and gives it very good performance. Julia is dynamic, and definitely beats the 10x slower than C barrier jerf mentioned.
For what I was using it for at the time (~3 years ago when I used it seriously) it offered performance close to the compiled orbital analysis code we had (in more conventional languages, Fortran and C) but with the flexibility for developing models of Python and other dynamic/interactive languages. An excellent tradeoff: very small performance cost for better interactivity and flexibility.
I helped on a language called Eve about 10 years ago. A truly relational language was exactly what that language was supposed to be, or at least that's what we were aiming at as a solution for a user-centric programming language.
The language we came up with was sort of like Smalltalk + Prolog + SQL. Your program was a series of Horn clauses backed by an Entity-Attribute-Value relational database. So you could write queries like "Search for all the clicks and get those whose target is a specific id, then as a result create a new fact that indicates a button was pressed. Upon the creation of that fact, change the screen to a new page". We even played around with writing programs like this in natural language before LLMs were a thing (you can see some of that here https://incidentalcomplexity.com/2016/06/10/jan-feb/)
It's very declarative, and you have to wrap your brain around the reactivity and working with collections of entities rather than individual objects, so programming this way can be very disorienting for people used to imperative OOP languages.
But the results are that programs are much shorter, and you get the opportunity for really neat tooling like time travel debugging, where you roll the database back to a previous point; "what-if" scenarios, where you ask the system "what would happen if x were y" and you can potentially do that for many values of y; "why not" scenarios, where you ask the system why a value was not generated; value provenance, where you trace back how a value was generated... This kind of tooling just doesn't exist with most languages because they are built to throw away as much information as possible at each stage of compilation. The kind of tooling I'm describing requires keeping and logging information about your program, and then leveraging it at runtime.
Most compilers and runtimes throw away that information as the program goes through the compilation process and as it's running. There is a cost to pay in terms of memory and speed, but I think Python shows that interpretation speed is not that much of a barrier to language adoption.
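To make the "keep the information around" point concrete, here is a tiny Python sketch of an append-only Entity-Attribute-Value log with time-travel queries. It is not how Eve was implemented; `FactLog` and its methods are hypothetical names chosen for illustration.

```python
class FactLog:
    """Append-only log of (op, entity, attribute, value) records."""

    def __init__(self):
        self._log = []

    def assert_fact(self, entity, attribute, value):
        """Add a fact and return the tick (log length) right after it."""
        self._log.append(("add", entity, attribute, value))
        return len(self._log)

    def retract(self, entity, attribute, value):
        """Record a retraction; the original assertion stays in the log."""
        self._log.append(("del", entity, attribute, value))
        return len(self._log)

    def facts_at(self, tick=None):
        """Replay the first `tick` records (all of them when tick is None)."""
        facts = set()
        for op, entity, attribute, value in self._log[:tick]:
            if op == "add":
                facts.add((entity, attribute, value))
            else:
                facts.discard((entity, attribute, value))
        return facts

    def query(self, attribute, value, tick=None):
        """Entities having attribute == value, optionally as of an earlier tick."""
        return sorted(e for e, a, v in self.facts_at(tick) if a == attribute and v == value)


db = FactLog()
t1 = db.assert_fact("click1", "target", "button7")
t2 = db.assert_fact("click2", "target", "button7")
db.retract("click1", "target", "button7")
print(db.query("target", "button7"))           # ['click2']
print(db.query("target", "button7", tick=t2))  # ['click1', 'click2']
```

Because nothing is ever overwritten, rolling back is just replaying a shorter prefix of the log, which is where the memory and speed cost mentioned above comes from.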
But like I said, that was many years ago and that team has disbanded. I think a lot of what we had in Eve still hasn't reached mainstream programming, although some of what we were researching found its way into Excel eventually.
> Loosen Up The Functions... Capabilities... Production-Level Releases... Semi-Dynamic Language... Modular Monoliths
I really like where the author's head is at. I think we have similar ideas about programming, because I've been developing a language called Mech that fits these descriptors to some degree since Eve development was shut down.
So this language is not supposed to be relational like Eve, but it's more like Matlab + Python + ROS (or Erlang if you want to keep it in the languages domain).
I have a short 10 min video about it here: https://www.hytradboi.com/2022/i-tried-rubbing-a-database-on... (brief plug for HYTRADBOI 2025, Jamie also worked on Eve, and if you're interested in the kinds of thing the author is, I'm sure you'll find interesting videos at HYTRADBOI '22 archives and similarly interested people at HYTRADBOI '25), but this video is out of date because the language has changed a lot since then.
Mech is really more of a hobby than anything since I'm the only one working on it aside from my students, who I conscript, but if anyone wants to tackle some of these issues with me I'm always looking for collaborators. If you're generally interested in this kind of stuff drop by HYTRADBOI, and there's also the Future Of Coding slack, where likeminded individuals dwell: https://futureofcoding.org. You can also find this community at the LIVE programming workshop which often coincides with SPLASH: https://liveprog.org
I remember both LightTable and Eve. At the time I thought they were both really interesting ideas but wasn't sure where they were going.
Re-reading the Eve website now, with 10+ years more experience and understanding of languages, I'm really astounded at how brilliant Eve was, and how far ahead of its time it was (and still is). Also at how rare it is for any revolutionary ideas in programming language design to make it out of theory in contemporary times. There were many radical ideas in the '60s and '70s, but so much now is incremental.
It's a shame Eve couldn't continue, just to see what it would've become and the influence it would have had on language expectations. Really cool stuff in there. While not likely, I hope someone picks up those ideas and continues them.
Did the effort just run out of funding? Or did it hit a stumbling block?
As far as "semi-dynamic" goes, C# has an interesting take coming from the other direction, i.e. a fully statically typed language that bolted on dynamic duck typing later.
It's done in a way that allows for a lot of subtlety, too. Basically you can use "dynamic" in lieu of most type annotations, and what this does is make any dispatch (in a broad sense - this includes stuff like e.g. overload resolution, not just member dispatch) on that particular value dynamic, but without affecting other values involved in the expression.
> Some Lisps may be able to do all this, although I don’t know if they quite do what I’m talking about here; I’m talking about there being a very distinct point where the programmer says “OK, I’m done being dynamic” for any given piece of code.
In Common Lisp there are tricks you can pull like declaring functions in a lexical scope (using labels or flet) to remove their lookup overhead. But CL is generally fast enough that it doesn't really matter much.
You can declaim inline a toplevel function. That doesn't necessarily mean that it will be integrated into callers. Among the possible effects is that the dynamism of reference can be culled away. If a function A calls B where B is declaimed inline, then A can be compiled to assume that definition of B. (Such that if B is redefined at run-time, A can keep calling the old B, not going through the #'B function binding lookup.)
I seem to remember that Common Lisp compilers are allowed to do this for functions that are in the same file even if they are not declaimed inline. If A and B are in the same file, and B is not declaimed notinline (the opposite of inline), then A can be translated to assume the B definition.
So all your helper functions in a Lisp module are allowed to be called more efficiently, not having to go through the function binding of the symbol.
For relational, look into term-rewriting systems which just keep transforming specified relationships into other things. Maude’s rewriting logic and engine could probably be used for relational programming. It’s fast, too.
As for "capabilities", I'm not sure I fully understand how that is advantageous over the convention of passing the helper function ("capability") as an argument to the "capable" function.
For instance, in Zig, you can see that a function allocates memory (capability) because it requires you to pass an allocator that it can call!
I'd like to see if others are more creative than me!
In Zig it's conventional to pass an allocator, but any code can end-run around the convention by reaching for page_allocator or c_allocator behind your back. Capabilities upgrade that convention into a guarantee.
That's pretty much how it plays out, as I understand it.
The trick is making sure that that object is the only possible way to do the thing. And making more features like that, for example networking, or file I/O, etc.
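The "pass an object that is the only way to do the thing" shape looks roughly like this in Python. It is a sketch of the convention only: `ReadCap` and `word_count` are hypothetical, and Python cannot supply the guarantee, since any code can still call `open()` directly, which is exactly the gap a capability-based language would close.

```python
class ReadCap:
    """Capability granting read access to a fixed set of named files."""

    def __init__(self, files, allowed=None):
        self._files = files
        self._allowed = set(files) if allowed is None else set(allowed)

    def read(self, name):
        if name not in self._allowed:
            raise PermissionError(name)
        return self._files[name]

    def attenuate(self, names):
        """Derive a weaker capability; it can never grant more than this one."""
        return ReadCap(self._files, self._allowed & set(names))


def word_count(reader, name):
    """The signature shows this function needs read access, and nothing else."""
    return len(reader.read(name).split())


full = ReadCap({"notes.txt": "one two three", "secret.txt": "hunter2"})
narrow = full.attenuate({"notes.txt"})
print(word_count(narrow, "notes.txt"))  # 3
```

Handing `narrow` to untrusted code lets it count words in `notes.txt`, while `narrow.read("secret.txt")` raises `PermissionError`.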
Totally agree that programming languages are a bit stagnant, with most new features being either trying to squeeze a bit more correctness out via type systems (we're well into diminishing returns here at the moment), or minor QoL improvements. Both are useful and welcome but they aren't revolutionary.
That said, here's some of the feedback of the type you said you didn't want >8)
(1) Function timeouts. I don't quite understand how what you want isn't just exceptions. Use a Java framework like Micronaut or Spring that can synthesize RPC proxies and you have things that look and work just like function calls, but which will throw exceptions if they time out. You can easily run them async by using something like "CompletableFuture.supplyAsync(() -> proxy.myCall(myArgs))" or in Kotlin/Groovy syntax with a static import "supplyAsync { proxy.myCall(myArgs) }". You can then easily wait for it by calling get() or skip past it. With virtual threads this approach scales very well.
The hard/awkward part of this is that APIs are usually defined these days in a way that doesn't actually map well to standard function calling conventions because they think in terms of POSTing JSON objects rather than being a function with arguments. But there are tools that will convert OpenAPI specs to these proxies for you as best they can. Stricter profiles that result in more idiomatic and machine-generatable proxies aren't that hard to do, it's just nobody pushed on it.
(2) Capabilities. A language like Java has everything needed to do capabilities (strong encapsulation, can restrict reflection). A java.io.File is a capability, for instance. It didn't work out because ambient authority is needed for good usability. For instance, it's not obvious how you write config files that contain file paths in systems without ambient authority. I've seen attempts to solve this and they were very ugly. You end up needing to pass a lot of capabilities down the stack, ideally in arguments but that breaks every API ever designed so in reality in thread locals or globals, and then it's not really much different to ambient authority in a system like the SecurityManager. At least, this isn't really a programming language problem but more like a standard library and runtime problem.
(3) Production readiness. The support provided by app frameworks like Micronaut or Spring for things like logging is pretty good. I've often thought that a new language should really start by taking a production server app written in one of these frameworks and then examining all the rough edges where the language is mismatched with need. Dependency injection is an obvious one - modern web apps (in Java at least) don't really use the 'new' keyword much which is a pretty phenomenal change to the language. Needing to declare a logger is pure boilerplate. They also rely heavily on code generators in ways that would ideally be done by the language compiler itself. Arguably the core of Micronaut is a compiler and it is a different language, one that just happens to hijack Java infrastructure along the way!
What's interesting about this is that you could start by forking javac and go from there, because all the features already exist and the work needed is cleaning up the resulting syntax and semantics.
(4) Semi-dynamic. This sounds almost exactly like Java and its JIT. Java is a pretty dynamic language in a lot of ways. There's even "invokedynamic" and "constant dynamic" features in the bytecode that let function calls and constants be resolved in arbitrarily dynamic ways at first use, at which point they're JITd like regular calls. It sounds very similar to what you're after and performance is good despite the dynamism of features like lazy loading, bytecode generated on the fly, every method being virtual by default etc.
(5) There's a library called Permazen that I think gets really close to this (again for Java). It tries to match the feature set of an RDBMS but in a way that's far more language integrated, so no SQL, all the data types are native etc. But it's actually used in a mission critical production application and the feature set is really extensive, especially around smooth but rigorous schema evolution. I'd check it out, it certainly made me want to have that feature set built into the language.
(6) Sounds a bit like PL/SQL? I know you say you don't want SQL but PL/SQL and derivatives are basically regular programming languages that embed SQL as native parts of their syntax. So you can do things like define local variables where the type is "whatever the type of this table column is" and things like that. For your example of easily loading and debug dumping a join, it'd look like this:
DECLARE
  -- Define a custom record type for the selected columns
  TYPE EmpDeptRec IS RECORD (
    name employees.first_name%TYPE,
    salary employees.salary%TYPE,
    dept departments.department_name%TYPE
  );
  -- PL/SQL identifiers are case-insensitive, so the variable needs a name distinct from its type
  empDept EmpDeptRec;
BEGIN
  -- Select columns from the joined tables into the record
  SELECT e.first_name, e.salary, d.department_name INTO empDept
  FROM employees e JOIN departments d ON e.department_id = d.department_id
  WHERE e.employee_id = 100;
  -- Output the data
  DBMS_OUTPUT.PUT_LINE('Name: ' || empDept.name);
  DBMS_OUTPUT.PUT_LINE('Salary: ' || empDept.salary);
  DBMS_OUTPUT.PUT_LINE('Department: ' || empDept.dept);
END;
It's not a beautiful language by any means, but if you want a natively relational language I'm not sure how to make it more so.
(7) I think basically all server apps are written this way in Java, and a lot of client (mobile) too. It's why I think a language with integrated DI would be interesting. These frameworks provide all the features you're asking for already (overriding file systems, transactions, etc), but you don't need to declare interfaces to use them. Modern injectors like Avaje Inject, Micronaut etc let you directly inject classes. Then you can override that injection for your tests with a different class, like a subclass. If you don't want a subtyping relationship then yes you need an interface, but that seems OK if you have two implementations that are really so different they can't share any code at all. Otherwise you'd just override the methods you care about.
Automatically working out the types of parameters sounds a bit like Hindley-Milner type inference, as seen in Haskell.
(8) The common way to do this in the Java world is have an annotation processor (compiler plugin) that does the lints when triggered by an annotation, or to create an IntelliJ plugin or pre-canned structural inspection that does the needed AST matching on the fly. IntelliJ's structural searches can be saved into XML files in project repositories and there's a pretty good matching DSL that lets you say things like "any call to this method with arguments like that and which is inside a loop should be flagged as a warning", so often you don't need to write a proper plugin to find bad code patterns.
I realize you didn't want feedback of the form "but X can do this already", still, a lot of these concepts have been explored elsewhere and could be merged or refined into one super-language that includes many of them together.
- I like the idea of a multiparadigm programming language (many exist), but one where you can write part of the code in a different language rather than trying to embed everything in the same syntax. I think this way you can write code and express your ideas differently.
- A [social] programming language where some variables and workflows are shared between users [1][2].
- A superreflective programming language inspired by Python, Ruby, and others where you can override practically everything to behave differently. For example, in Python you can override a function call for an object but not for the base system; the globals() dict cannot be overridden. See [3]. In this way you save a lot of time writing a parser and the language's basic logic.
- A declarative language to stop reinventing the wheel: "I need a website with a secure login/logout/forgot_your_password_etc, choose a random() template". It doesn't need to be in natural language though.
Egont sounds a bit like SQL, no? A social way to share data and work with it ... a shared RDBMS where everyone has a user account and can create tables/share them with other users, built in security, etc. Splat a GUI on top and you have something similar.
Modern web frameworks are getting pretty declarative. If you want a basic web app with a log in/out page that's not hard to do. I'm more familiar with Micronaut than Spring but you'd just add:
micronaut.security.authentication=cookie
and the relevant dependencies. Now you write a class that checks the username/password, or use LDAP, or configure OAuth and the /login URL takes a POST of username/password. Write a bit of HTML that looks good for your website and you're done.
> Egont sounds a bit like SQL, no? A social way to share data and work with it ... a shared RDBMS where everyone has a user account and can create tables/share them with other users, built in security, etc. Splat a GUI on top and you have something similar.
Yes, SQL or a global spreadsheet. I would say that it is like SQL plus a DAG or, we can imagine an aggregation of SQLs. The interesting thing is that parts of the global system are only recalculated if there is a change, like in a spreadsheet.
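To make the "only recalculated if there is a change" part concrete, here is a minimal Python sketch of spreadsheet-style cells arranged in a DAG. It is not how Egont works; `Cell`, `set`, `get` and the `recalcs` counter are all names invented for this illustration.

```python
class Cell:
    """A value or a formula over other cells, recomputed lazily when stale."""

    def __init__(self, compute=None, inputs=(), value=None):
        self.compute = compute            # None for plain input cells
        self.inputs = list(inputs)
        self.dependents = []
        self.value = value
        self.dirty = compute is not None  # formulas start out stale
        self.recalcs = 0                  # how often this formula actually ran
        for cell in self.inputs:
            cell.dependents.append(self)

    def set(self, value):
        # Only a real change invalidates anything downstream.
        if value != self.value:
            self.value = value
            self._invalidate_dependents()

    def _invalidate_dependents(self):
        for dep in self.dependents:
            if not dep.dirty:
                dep.dirty = True
                dep._invalidate_dependents()

    def get(self):
        if self.dirty:
            self.value = self.compute(*(c.get() for c in self.inputs))
            self.dirty = False
            self.recalcs += 1
        return self.value


price = Cell(value=10)
qty = Cell(value=3)
subtotal = Cell(lambda p, q: p * q, [price, qty])
shipping = Cell(lambda q: 2 * q, [qty])
total = Cell(lambda s, sh: s + sh, [subtotal, shipping])

first = total.get()    # computes subtotal, shipping and total once each
price.set(20)          # invalidates subtotal and total, but not shipping
second = total.get()
```

After the price change only `subtotal` and `total` run again; `shipping` depends only on `qty`, so its cached value is reused.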
> a shared RDBMS where everyone has a user account and can create tables/share them with other users, built in security, etc. Splat a GUI on top and you have something similar.
We need a little bit more but not much more: security by namespaces and/or rows so the same database is shared but you can restrict who changes what: your "rows" are yours. I think something like OrbitDB but with namespaces would be cool.
> Modern web frameworks are getting pretty declarative.
Yes but my proposal was at a higher level. I don't want to know what a cookie is when I just want to create a website. I am not saying that you can create complex components with this idea but you can create common use cases.
It is not. I didn't want to give a half explanation, but it is another case of how hard it has become to come up with good Google searches.
But you use capabilities all the time... operating system users work that way. As a user, you can't "just" execute some binary somewhere and thereby get access to parts of the system your user doesn't have rights to. (Forget setuid for a second, which is intended precisely to get around this, and let's just look at the underlying primitive.)
Capabilities in programming languages take the granularity further down. You might call some image manipulation code in a way that it doesn't have the capability to manipulate the file system in general, for example, or call a function to change a user's login name with capabilities that only allow changing that user, even if another user ID somehow gets in there.
It would be a fairly comprehensive answer to the software dependency issues that continue to bubble up; it would matter less if a bad actor took over "leftpad" if leftpad was actively constrained by the language to only be able to manipulate strings, so the worst an actor could do is make it manipulate strings the wrong way, rather than start running arbitrary code. Or put another way, if the result of the bad actor taking the package wasn't that people got hacked but users started getting
compile error in file.X:28: library "leftpad" tried to open a file without file system capabilities
compile error in file.X:30: library "leftpad" tried to open a socket without network capabilities
which would immediately raise eyebrows.
It's not a new idea, in that E already tried it, and bits and pieces of it are everywhere ("microkernels" is another place where you'll see this idea, but at the OS level and implemented in languages that have no native concept of the capabilities), but for the most part our programming languages do not reflect this.
> But you use capabilities all the time... operating system users work that way.
Most operating systems don't have proper capabilities; they use things like ACLs, RBAC, MAC, etc. for permissions.
The golden rule of capabilities is that you should not separate designation from authority. The capability itself represents the authority to access something, and designates what is being accessed.
For the equivalent in operating systems land, look at the respective manual pages for Linux capabilities[1] or OpenBSD pledge[2] and unveil[3]. The general idea is that there are some operations that might be dangerous, and maybe we don't want our program to have unrestricted access to them. Instead, we opt-in to the subset that we know we need, and don't have access to the rest.
There's some interest in the same thing, but at the programming language level. I'm only aware of it being implemented academically.
I don't think that Linux capabilities have much to do with the capabilities that the OP intends.
In a capabilities system, a program has permission to act on any object if it has a reference (aka a capability) to the object; there is no other access control. A program acquires a capability either by receiving it from its parent (or caller, in the case of a function) or in some other way, such as message passing. There is no other source of capabilities, and they are unforgeable.
Unix file descriptors act in many ways as capabilities: they are inherited by processes from their parents and can be passed around via Unix sockets, and grant to the FD holder the same permissions to the referenced object as the creator of the file descriptor.
Of course, since Unix has ways of creating file descriptors other than inheritance and message passing, it is not truly a capabilities system.
It's implemented in Java! .NET tried it too, UNIX file descriptors are capabilities, Mach ports are capabilities. Capabilities are widely used far outside of academia and have been for a long time.
What people often mean when they say this is a so-called pure capability system, where there are no ambient permissions at all. Such systems have terrible usability and indeed have never been made to work anywhere, not even in academia as far as I know.
> This is not a new idea, so I won’t go deeply into what it is
So, no, the author claims it too.
Capabilities are a way to do access control where the client holds the key to access something, instead of the server holding a list of what is allowed based on the clients' identities.
But when people use that word, they are usually talking about fine-grained access control. On a language level, that would mean not granting access for example for a library to do network connections, even though your program as a whole has that kind of access.
For example, consider a simple function to copy files. We could implement it like this:
def copy(fs: Filesystem, in: Path, out: Path) {
inH: HandleRead = fs.openRead(in);
outH: HandleWrite = fs.openWrite("/tmp/TEST_OUTPUT");
finished: Boolean = false;
while (!finished) {
match (inH.read()) {
case None: finished = true;
case Some(data): outH.write(data);
}
}
inH.close();
outH.close();
}
However, there are many ways that things could go awry when writing code like this; e.g. it will write to the wrong file, since I forgot to put the real `out` value back after testing (oops!). Such problems are only possible because we've given this function the capability to call `fs.open` (in many languages the situation's even worse, since that capability is "ambient": available everywhere, without having to be passed in like `fs` above). There are also other capabilities/permissions/authorities implicit in this code, since any call to `fs.open` has to have the right permissions to read/write those files.
In contrast, consider this alternative implementation:
def copy(inH: HandleRead, outH: HandleWrite) {
finished: Boolean = false;
while (!finished) {
match (inH.read()) {
case None: finished = true;
case Some(data): outH.write(data);
}
}
inH.close();
outH.close();
}
This version can't use the wrong files, since it doesn't have any access to the filesystem: there's literally nothing we could write here that would mean "open a file"; it's unrepresentable. This code also can't mix up the input/output, since only `inH` has a `.read()` method and only `outH` has a `.write()` method. The `fs.open` calls will still need to be made somewhere, but there's no reason to give our `copy` function that capability.
In fact, we can see the same thing on the CLI:
- The first version is like `cp oldPath newPath`. Here, the `cp` command needs access to the filesystem, it needs permission to open files, and we have to trust that it won't open the wrong files.
- The second version is like `cat < oldPath > newPath`. The `cat` command doesn't need any filesystem access or permissions, it just dumps data from stdin to stdout; and there's no way it can get them mixed up.
The fundamental idea is that trying to choose whether an action should be allowed or not (e.g. based on permissions) is too late. It's better if those who shouldn't be allowed to do an action, aren't even able to express it at all.
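For anyone who wants to run the second version, here is a rough Python equivalent. Python does not enforce any of this (`open` is still ambiently available as a builtin), so treat it as a convention sketch rather than a real capability system; the `chunk_size` parameter and the byte-count return value are my own additions.

```python
import io
from typing import BinaryIO


def copy(in_h: BinaryIO, out_h: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy everything from in_h to out_h and return the number of bytes copied.

    The function only receives already-opened handles, so nothing in its
    body names a path or opens a file.
    """
    total = 0
    while True:
        data = in_h.read(chunk_size)
        if not data:  # empty bytes means end of input
            break
        out_h.write(data)
        total += len(data)
    return total


# The caller decides which resources exist; in-memory buffers work as well as files.
src = io.BytesIO(b"hello capabilities")
dst = io.BytesIO()
copied = copy(src, dst)
```

Because the caller supplies the handles, a test can pass in-memory buffers and the function never touches the real filesystem.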
You're right that this can often involve "keys", but that's quite artificial: it's like adding extra arguments to each function, and limiting which code is scoped to see the values that need to be passed as those arguments (e.g. `fs.openRead(inPath, keyThatAllowsAccess)`), when we could have instead scoped our code to limit access to the functions themselves (though for HTTP APIs, everything is a URL; so "unguessable function endpoint URL" is essentially the same as "URL with secret key in it")
The key property is that everything can only be accessed via handles, including, recursively, other handles (i.e. to get a handle to an object you first need to already have a handle to the handle-giver for that object).
A capability is basically a reference which both designates some resource to be accessed and provides the authority to access it. The authority is not held somewhere else like an Access Control List - the reference is the authority. Capabilities must be unforgeable - they're obtained by delegation.
---
To give an example of where this has been used in a programming language, Kernel[1] uses a capability model for mutating environments. Every function (or operative) has an environment which holds all of its local variables, the environment is encapsulated and internally holds a reference to the parent environment (the surrounding scope). The rule is that we can only mutate the local variables of an environment to which we have a direct reference, but we cannot mutate variables in the parents. In order to mutate the variables in the parent, we must have a direct reference to the parent, but there is no mechanism in the language to extract the parent reference from the environment it is encapsulated in.
For example, consider the following trivial bit of code: We define some variable `x` with initial value "foo", we then mutate it to have the value "bar", then look up `x`.
($define! x "foo")
($set! (get-current-environment) x "bar")
x
As expected, this returns "bar". We have a direct reference to the local environment via `(get-current-environment)`.
Technically we could've just written `($define! x "bar")`, where the current environment is assumed, but I used `$set!` because we need it for the next example.
When we introduce a new scope, the story is different.
($define! x "foo")
($define! foo
($lambda ()
($set! (get-current-environment) x "bar")))
(foo)
x
Here we create a function foo, which has its own local environment, with the top-level environment as its parent. We can read `x` from inside this environment, but we can't mutate it. In fact, this code inserts a new variable `x` into the child environment which shadows the existing one within the scope of the function, but after `foo` has returned, this environment is lost, so the result of the computation is "foo". There is no way for the body of this lambda to mutate the top-level environment here because it doesn't have a direct reference to it.
So far basically the same static scoping rules you are used to, but environments in Kernel are first-class, so we can get a reference to the top-level environment which grants the child environment the authority to mutate the top level environment.
($define! x "foo")
($define! env (get-current-environment))
($define! foo
($lambda ()
($set! env x "bar")))
(foo)
x
And the result of this computation is "bar".
However, by binding `env` in the top-level environment, all child scopes can now have the ability to mutate the top-level.
To avoid polluting the environment in such a way, the better way to write this is with an operative (as opposed to $lambda), which implicitly receives the caller's environment as an argument, which it binds to a variable in its local environment.
($define! x "foo")
($define! foo
(wrap ($vau () caller-env
($set! caller-env x "bar"))))
(foo)
x
Now `foo` specifically can mutate its caller's local environment, but it can't mutate the variables of the caller of the caller, and we have not exposed this authority to all children of the top-level.
---
This is only a trivial example, but we can do much more clever things with environments in Kernel. We can construct new environments at runtime, and they can have multiple parents, ultimately forming a DAG, where environment lookup is a Depth-First-Search, but the references to the parent environments are encapsulated and cannot be accessed, so we cannot mutate parent scopes without a direct reference - we can only mutate the root node of the DAG for an environment to which we have a direct reference. The direct reference is a capability - it's both the means and the authority to mutate.
---
We can use these first-class environments in conjunction with things like `$remote-eval`, which evaluates some piece of code in an environment provided by the user, which may contain only the bindings they specify, and does not capture anything from the surrounding scope.
If the code we evaluate calls `write` but the environment we supply does not bind it, we get an error that `write` is unbound, even though `write` is available in the scope in which we performed this evaluation. We could catch this error with a guarded continuation so the program does not crash.
This combination of features basically let you create "mini sandboxes", or custom DSLs, with more limited capabilities than the context in which they're evaluated. Most languages only let you add new capabilities to the static environment, by defining new functions and types - but how many languages let you subtract capabilities, so that fewer features are available in a given context? Most languages do this purely at compile time via a module/import system, or with static access modifiers like `public` and `private`. Kernel lets you do this at runtime.
---
One thing missing from this example, which is required for true capabilities, is the ability to revoke the authority. The only way we could revoke the capability of a function to mutate an environment is to suspend the program.
Proper capabilities allow revocation at any time. If the creator of a capability revokes the authority, this should propagate to all duplicated, delegated, or derived capabilities with immediate effect. The capabilities that were held become "zombies", which no longer provide the means nor the authority - and this is why it is essential that we don't separate designation from authority, and why these should both be encapsulated in the capability.
This clearly makes it difficult to provide proper capabilities in programming languages, because we have to handle every possible error where we attempt to access a zombie capability. The use of such capabilities should be limited to where they really matter such as access to operating system resources, cryptographic keys, etc. where it's reasonable to implement robust error handling code. We don't want capabilities for every programming language feature because we would need to insert error checks on every expression to handle the potential zombie. Attempting to check if a capability is live before using it is no solution anyway, because you would have race conditions, so the correct approach to using them is to just try and catch the error if it occurs.
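A small Python sketch of the revocation behaviour described above, using a forwarder that shares one "switch" with every delegated copy. All the names (`Capability`, `make_revocable`, `Revoked`) are made up for the illustration, and Python cannot actually hide `_target` from a determined caller, so this shows the shape of the idea and nothing more. It is also single-threaded; a real implementation would need the liveness check and the call to be atomic.

```python
class Revoked(Exception):
    """Raised when a revoked ("zombie") capability is used."""


class _Switch:
    def __init__(self):
        self.live = True


class Capability:
    """Designates a target and carries the authority to call it."""

    def __init__(self, target, switch):
        self._target = target
        self._switch = switch

    def invoke(self, method, *args):
        if not self._switch.live:
            raise Revoked(method)
        return getattr(self._target, method)(*args)

    def delegate(self):
        # Delegated copies share the switch, so revocation reaches them too.
        return Capability(self._target, self._switch)


def make_revocable(target):
    """Return (capability, revoke); only the creator holds revoke."""
    switch = _Switch()

    def revoke():
        switch.live = False

    return Capability(target, switch), revoke


log = []
cap, revoke = make_revocable(log)
delegated = cap.delegate()
delegated.invoke("append", "first")
revoke()

# Don't ask whether the capability is still live; just try and handle the failure.
try:
    delegated.invoke("append", "second")
    outcome = "written"
except Revoked:
    outcome = "revoked"
```

The caller never checks liveness up front. It attempts the operation and handles `Revoked`, which is the try-and-catch approach argued for above.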
Another take-away from this is that if capabilities are provided in a language via a type system, it must be a dynamic type system. You cannot grant authority in a static type system at compile time if the capability may have already been revoked by the time the program is run. Capabilities are inherently dynamic by nature because they can be revoked at any time. This doesn't mean you can't use capabilities in conjunction with a static type system - only that the static type system can't really represent capabilities.
You can find out a lot more about them on the erights page that others have linked, and I would recommend looking into seL4 if you're interested in how they're applied to operating systems.
* In general, whenever I hear "compiler will optimize this", I die a little on the inside. Not even because it's delegating solution of the newly created problem to someone else, but because it creates a disconnect between what the language tells you is possible and what actually is possible. It encourages this kind of multi-layer lie that, in anger, you will have to untangle later, and will be cursing a lot, and will not like the language one bit.
* Capabilities. Back in the days when ActionScript 3 was relevant, there was a big problem of dynamic code sharing. Many services tried to implement module systems in AS3, but the security was not done well. To give an example: a gaming portal written in AS3 wants to load games written by programmers who aren't the portal programmers (and could be malicious, i.e. trying to steal data from other programs, or cause them to malfunction etc.) ActionScript (and by extension the ECMAScript 4 draft it was based on) had a concept of namespaces borrowed from XML (so not like in C++), where availability of a particular function was, among other things, governed by whether the caller was allowed to access the namespace. There were some built-in namespaces, like "public", "private", "protected" and "internal", that functioned similarly to their Java namesakes. But users were allowed to add any number of custom namespaces. These namespaces could then be shared through a function call in a public namespace. I.e. the caller would have to call the function and supply some kind of a password, and if the password matched, the function would return the namespace object, and then the caller could use that namespace object to call the functions in that namespace. I tried to promote this concept in Flex Framework for dealing with module loading, but that never was seriously considered... Also, people universally hated XML namespaces (similar to how people seem to universally hate regular expressions). But, I still think that it could've worked...
* All this talk about "dynamic languages"... I really don't like it when someone creates a bogus category and then says something very general about it. That whole section has no real value.
* A Truly Relational Language -- You mean, like Prolog? I wish more relational databases exposed their content via a Prolog-like language in addition to SQL. I believe it's doable, but very few people seem to want it, and so it's not done.
“It feels like programming languages are stagnating.”
As they should be. Not every language needs to turn into C++, Rust, Java, C#, or Kotlin.
The only group I see lamenting about features these days are PL theorists, which is fine for research languages that HN loves but very few use outside the bubble.
Folks, this is not a process that converges. We've now had 60 years of language design, use and experience. We're not going to get to an ideal language because there are (often initially hidden) tradeoffs to be made. Everyone has a different idea of which side of each tradeoff should be taken. Perhaps in the future we can get AI to generate and subsequently re-generate code, thereby avoiding the need to worry too much about language design (AI doesn't care that it constantly copies/pastes or has to refactor all the time).
For "value database", it seems to me that the trick is, you can't just ship the executable. You have to ship the executable plus the stored values, together, as your installation image. Then what you run in production is what you tested in staging, which is what you debugged on your development system.
I mean, there still may be other things that make this a Bad Idea(TM). But I think this could get around the specific issue mentioned in the article.
If it's about well-contained applications in a well designed (and user-centric) OS with a proper concept of "application" and "installation", with a usable enough mechanism, I don't see anything that would make it bad.
On Windows it's a disaster. To the point that dumping random text files around in Linux works better.
- Looser functions (badly chosen name)
Timeouts on calls are, as the OP mentions, a thing in Erlang. Inter-process and inter-computer calls in QNX can optionally time out, and this includes all system calls that can block. Real-time programs use such features. Probably don't want it on more than that. It's like having exceptions raised in things you thought worked.
- Capabilities
They've been tried at the hardware level, and IBM used them in the System/38, but they never caught on. They're not really compatible with C's flat memory model, which is partly why they fell out of fashion. Capabilities mean having multiple types of memory. Might come back if partially-shared multiprocessors make a comeback.
- Production-Level Releases
That's kind of vague. Semantic versioning is a related concept. It's more of a tooling thing than a language thing.
- Semi-Dynamic Language
I once proposed this for Python. The idea was that, at some point, the program made a call that told the system "Done initializing". After that point, you couldn't load more code, and some other things that inhibit optimization would be prohibited. At that point, the JIT compiler runs, once. No need for the horrors inside PyPy which deal with cleanup when someone patches one module from another.
Guido didn't like it.
- Value Database
The OP has a good criticism of why this is a bad idea. It's an old idea, mostly from LISP land, where early systems saved the whole LISP environment state. Source control? What's that?
- A Truly Relational Language
Well, in Python, almost everything is a key/value store. The NoSQL people were going in that direction. Then people remembered that you want atomic transactions to keep the database from turning to junk, and mostly backed off from NoSQL where the data matters long-term.
- A Language To Encourage Modular Monoliths
Hm. Needs further development. Yes, we still have trouble putting parts together. There's been real progress. Nobody has to keep rewriting Vol. I of Knuth algorithms in each new project any more. But what's being proposed here?
- Modular Linting
That's mostly a hack for when the original language design was botched. View this from the point of the maintenance programmer - what guarantees apply to this code? What's been prevented from happening? Rust has one linter, and you can add directives in the code which allow exceptions. This allows future maintenance programmers to see what is being allowed.
> Capabilities ... They're not really compatible with C's flat memory model ... Capabilities mean having multiple types of memory
C is not really dependent on a flat memory model - instead, it models memory allocations as separate "objects" (quite reminiscent of "object orientation in hardware" which is yet another name for capabilities), and a pointer to "object" A cannot be offset to point into some distinct "object" B.
> A Truly Relational Language
This is broadly speaking how PROLOG and other logic-programming languages work. The foundational operation in such languages is a knowledge-base query, and "relations" are the unifying concept as opposed to functions with predefined inputs and outputs.
> In general, while I can’t control how people react to this list, should this end up on, say, Hacker News, I’m looking more for replies of the form “that’s interesting and it makes me think of this other interesting idea” and less “that’s stupid and could never work because X, Y, and Z so everyone stop talking about new ideas” or “why hasn’t jerf heard of this other obscure language that tried that 30 years ago”. (Because, again, of course I don’t know everything that has been tried.)
OK, a few things that languages ought to have:
- Everything except C now has standard strings, not just arrays of characters. Almost all languages now have some standard way to do key/value sets. What else ought to be standard?
-- Arrays of more than one dimension would be helpful for numerical work. Most languages descended from C lack this. They only have arrays of arrays. Even Rust lacks it. Proposals run into bikeshedding - some people want rectangular slices out of arrays, which means carrying stride info around.
-- Standard types for 2, 3 and 4-element vectors would help in graphics work. There are too many different implementations of those in most languages and too much conversion.
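To show what "carrying stride info around" means for rectangular slices, here is a minimal Python sketch of a 2D view over a flat buffer. `View2D` and `window` are hypothetical names for this illustration, not any library's API.

```python
class View2D:
    """A rows x cols view onto a flat buffer, described by offset and strides."""

    def __init__(self, data, rows, cols, offset=0, row_stride=None, col_stride=1):
        self.data = data
        self.rows, self.cols = rows, cols
        self.offset = offset
        # A freshly made array is contiguous: one row is `cols` elements long.
        self.row_stride = cols if row_stride is None else row_stride
        self.col_stride = col_stride

    def _index(self, r, c):
        if not (0 <= r < self.rows and 0 <= c < self.cols):
            raise IndexError((r, c))
        return self.offset + r * self.row_stride + c * self.col_stride

    def __getitem__(self, rc):
        r, c = rc
        return self.data[self._index(r, c)]

    def __setitem__(self, rc, value):
        r, c = rc
        self.data[self._index(r, c)] = value

    def window(self, r0, c0, rows, cols):
        """Rectangular slice that shares storage with the parent view."""
        if r0 < 0 or c0 < 0 or r0 + rows > self.rows or c0 + cols > self.cols:
            raise IndexError("window out of bounds")
        start = self.offset + r0 * self.row_stride + c0 * self.col_stride
        # The slice keeps the parent's strides; only offset and shape change.
        return View2D(self.data, rows, cols, start, self.row_stride, self.col_stride)


grid = View2D(list(range(12)), rows=3, cols=4)   # rows: 0..3, 4..7, 8..11
inner = grid.window(1, 1, 2, 2)                  # covers 5, 6 / 9, 10
inner[1, 0] = 50                                 # writes through to grid[2, 1]
```

The slice is only two rows by two columns, but its row stride is still 4. That is the extra bookkeeping an array-of-arrays representation cannot express.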
Things to think about:
- Rust's ownership restrictions are harsh. Can we keep the safety and do more?
-- The back-reference problem needs to be solved somehow. Back references can be done with Rc and Weak, but it's clunky.
-- Can what Rust does with Rc, RefCell, and .borrow() be checked at compile time? That allows eliminating the run-time check, and provides assurance that the run-time check won't fail. Something has to look at the entire call tree at compile time, and sometimes it won't be possible to verify this at compile time. But most of the time, it should be.
-- There's a scheme for ownership where there's one owning reference and N using references. The idea is to verify at compile time that the using references cannot outlive the owning one. Then there's no need for reference counts.
-- Can this be extended to the multi-thread case? There have been academic demos of static deadlock detection, but that doesn't seem to have made it into production languages.
-- A common idiom involves things being owned by handles, but also indexed for lookup by various keys. Dropping the handle drops the object and removes it from the indices. Is that a useful general purpose operation? It's one that gets botched rather often.
-- Should compilers have SAT-solver level proof systems built in?
-- Do programs really have to be in monospaced fonts? (Mesa on the Alto used the Bravo word processor as its text editor. Nobody does that any more.)
-- There's async, there are threads, and there are "green threads", such as Go's "goroutines". Where's that going?
-- Can we have programs which run partly in a CPU and partly in a GPU, compiled together with the appropriate consistency checks, so the data structures and calls must match to compile?
-- How about "big objects?" These are separately built program components which have internal state and some protection from their callers. Microsoft OLE did that, some .dll files do that, and Intel used to have rings of protection and call gates to help with that, hardware features nobody used. But languages never directly supported such objects.
So there are a few simple ideas to think about.
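The "owned by handles, but also indexed for lookup" idiom from the list above can be sketched in Python like this. `Registry`, `Handle` and the two indices are invented for the example, and since Python has no deterministic drop, the handle is released with an explicit `drop()` or a `with` block.

```python
class Handle:
    """Owning reference: dropping it removes the object from every index."""

    def __init__(self, registry, name, tag, obj):
        self._registry = registry
        self.name, self.tag, self.obj = name, tag, obj
        self._live = True

    def drop(self):
        if self._live:  # dropping twice is harmless
            self._live = False
            self._registry._unindex(self)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.drop()


class Registry:
    def __init__(self):
        self._by_name = {}  # name -> handle
        self._by_tag = {}   # tag -> set of handles

    def insert(self, name, tag, obj):
        if name in self._by_name:
            raise KeyError(f"duplicate name: {name}")
        handle = Handle(self, name, tag, obj)
        self._by_name[name] = handle
        self._by_tag.setdefault(tag, set()).add(handle)
        return handle

    def _unindex(self, handle):
        # The one place that knows about every index, so none is forgotten.
        del self._by_name[handle.name]
        bucket = self._by_tag[handle.tag]
        bucket.discard(handle)
        if not bucket:
            del self._by_tag[handle.tag]

    def by_name(self, name):
        handle = self._by_name.get(name)
        return None if handle is None else handle.obj

    def by_tag(self, tag):
        return sorted(h.name for h in self._by_tag.get(tag, ()))


reg = Registry()
a = reg.insert("a.txt", "text", object())
with reg.insert("b.txt", "text", object()):
    during = reg.by_tag("text")
after = reg.by_tag("text")
a.drop()
final = reg.by_tag("text")
```

The part that usually gets botched is forgetting one of the indices on removal. Routing every removal through `_unindex` is one way to avoid that, though nothing here is checked at compile time.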
> - Rust's ownership restrictions are harsh. Can we keep the safety and do more?
https://www.languagesforsyste.ms/publication/fearless-concur...
> -- Should compilers have SAT-solver level proof systems built in?
They already do. The exhaustivity checker in Rust (and functional languages) is equivalent in power to SAT.
C has arrays with more than one dimension. (but no direct support for strides). Of course, it is just the same thing as arrays of arrays.
> Do programs really have to be in monospaced fonts?
Of course not. I've been using a proportional font for at least 10 years and I'm still in business working on code bases shared with developers using monospaced fonts. Both work, none disturb the other, proportional is easier to read as any book can demonstrate. Alignment doesn't matter much.
> But what's being proposed here?
He proposes that there is a need for a way to connect modules, i.e. dependency injection, without the modules having explicit knowledge of each other, with compile-time verification that the modules being connected are compatible, without the interface song and dance.
Commercial image based systems have had source control like management for decades.
If anything many "modern" low code SaaS products are much worse in this regard, than what Lisp and Smalltalk have been offering for years.
> Source control? What's that?
I think Squeak had Monticello for source control with its image-based approach 20+ years ago, and there was something else for Smalltalk in the '80s too.
But yeah people like text and hate images, and I believe Pharo switched back to some git integration.
Smalltalk implementations have had text export/import for ages, and image based source control as you point out, is also quite old, Monticello wasn't the first.
In some way, Smalltalk had/has a more advanced and semantic diffing called ChangeSet
I'm surprised these are called "programming language ideas". They seem to be solvable, at least many of them, with libraries. For example, my Haskell effect system Bluefin can be seen as a capability system for Haskell. My database library Opaleye is basically a relational query language for Haskell. Maybe I'm short-sighted but I haven't seen the need for a whole new language to support any of that functionality. In fact one gets huge benefits from implementing such things in an existing language.
* https://hackage.haskell.org/package/bluefin
* https://hackage.haskell.org/package/opaleye
One advantage (which is touched on in the logging section) is that having it provided by the language makes it clear what the default is, and sets expectations. Essentially, lifting it into the language is a way of coordinating the community.
Thanks for sharing your thoughts.
I also agree that a relational approach to in-memory data is a good, effective idea.
I recently compiled some of my C code with the SQLite database, and I'm starting to think about how the SQL model of my standard code could be used as the actual implementation language of in-memory operations.
Instead of writing the hundredth loop through objects, I just write a SQL query with joins, treating the internal data representation of the software as an information system instead of bespoke code.
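A minimal sketch of that idea, using Python's standard `sqlite3` module with an in-memory database rather than the C setup the comment describes. The table and column names (`users`, `orders`) and the helper `total_spent_by_user` are made up for illustration.

```python
import sqlite3

def total_spent_by_user(users, orders):
    """Return {user name: total order amount} via a SQL join instead of nested loops.

    `users` is a list of (id, name) and `orders` a list of (user_id, amount);
    both shapes are assumptions made for this example.
    """
    db = sqlite3.connect(":memory:")  # the database lives only in RAM
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
    db.executemany("INSERT INTO users VALUES (?, ?)", users)
    db.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    rows = db.execute(
        """
        SELECT u.name, SUM(o.amount)
        FROM users AS u
        JOIN orders AS o ON o.user_id = u.id
        GROUP BY u.name
        ORDER BY u.name
        """
    ).fetchall()
    db.close()
    return dict(rows)
```

The join and the aggregation replace what would otherwise be a hand-written loop over users with an inner loop over orders.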
I was hoping to make it possible to handle batches of data and add parallelism because arrays are useful when you want to parallelise.
I was thinking, wouldn't it be good if you could write your SQL queries in advance of the software and then parse them and then compile them to C code (using an unrolled loop of the SQLite VM) so they're performant. (For example, instead of a btree for a regular system operation, you can just use a materialised array a bit like a filesystem so you're not rejoining the same data all the time)
I was thinking of ways of representing actors somehow communicating by tables but I do not have anything concrete for that.
Using the same idea there is https://datadraw.sourceforge.net/ and https://github.com/google/rune using it.
DataDraw is an ultra-fast persistent database for high performance programs written in C. It's so fast that many programs keep all their data in a DataDraw database, even while being manipulated in inner loops of compute intensive applications. Unlike slow SQL databases, DataDraw databases are compiled, and directly link into your C programs. DataDraw databases are resident in memory, making data manipulation even faster than if they were stored in native C data structures (really). Further, they can automatically support infinite undo/redo, greatly simplifying many applications.
I agree about relational languages. It's absurd when I think that SQL and Datalog came from the same foundations of relational calculus. It's just so much lost expressive power.
I really like what PRQL [1] did, at least it makes table operations easily chainable. Another one that comes to mind is Datomic [2].
[1]: https://prql-lang.org/
[2]: https://docs.datomic.com/peer-tutorial/query-the-data.html
I explore the idea with https://tablam.org (relational + array). I even toyed with making relational queries to make types:
So the types are in full sync when changes on the schema happen. And all of this is type safe.
I was struggling with doing interesting things with the semantic web circa 2007 and was thinking "OWL sucks" and looking at Datalog as an alternative. At that time Datalog was an obscure topic and hard to find information about it. 10 years later it was big.
(Funny after years of searching I found somebody who taught me how to do really complex modelling in OWL DL but from reading the literature I'm pretty sure the average PhD or prof in the field has no idea.)
> I found somebody who taught me how to do really complex modelling in OWL DL
Is there any resource you could recommend for that?
I wrote up what I learned in a technical report that got sent to the editors at ISO a month or so ago and ought to appear pretty soon. Look up my profile and send me a note.
Uh... given the beauty of relational algebra I don't understand how we ended up with the ugly mess of sql.
This makes me want to throw up
https://www.postgresql.org/docs/current/queries-with.html
If you would like some exposure therapy: https://databasearchitects.blogspot.com/2024/12/advent-of-co... [0]
[0] Recent discussion https://news.ycombinator.com/item?id=42577736
Some people might think it is crazy but I like wrapping queries like that up in JooQ so I can write
Life finds a way, I suppose:-)
You might be interested in looking at the Lima programming language: http://btetrud.com/Lima/Lima-Documentation.html . It has ideas that cover some of these things. For example, it's intended to operate with fully automatic optimization. This assumption allows shedding lots of complexity that arises from needing to do the same logical thing in multiple ways that differ in their physical efficiency characteristics. Like instead of having 1000 different tree classes, you have 1 and optimisers can then look at your code and decide what available tree structures make most sense in each place. Related to your async functions idea, it does provide some convenient ways of handling these things. While functions are just normal functions, it has a very easy way to make a block async (using "thread") and provides means of capturing async errors that result from that.
For semi-dynamic language, Julia definitely took the approach of being a dynamic language that can be (and is) JITed to excellent machine code. I personally have some larger projects that do a lot of staged programming and even runtime compilation of user-provided logic using Julia. Obviously the JIT is slower to complete than running a bit of Lua or whatever, but the speed after that is phenomenal and there’s no overhead when you run the same code a second time. It’s pretty great and I’d love to see more of that ability in other languages!
Some of the other points resonate with me. I think sensible dynamic scoping would be an easy way to do dependency injection. Together with something like linear types you could do capabilities pretty smoothly, I think. No real reason why you couldn’t experiment with some persistent storage as one of these dependencies, either. Together with a good JIT story would make for a good, modular environment.
Oh and Zig is another option for allowing injections that are checked when used at a call site rather than predefined through interfaces.
AFAIK it doesn’t have closures (it’s too C-like) so you need to use methods for all your (implicit) interfaces, but that’s okay…
I think the “exemplars” could be automatically yoinked from documentation and tests and existing usage of the function in the code base. Work needs to be done on the IDE front to make this accessible to the user.
> Value Database
> Smalltalk and another esoteric programming environment I used for a while called Frontier had an idea of a persistent data store environment. Basically, you could set global.x = 1, shut your program down, and start it up again, and it would still be there.
Frontier! I played with that way back when on the Mac. Fun times.
But as for programming language with integrated database... MUMPS! Basically a whole language and environment (and, in the beginning, operating system) built around a built-in global database. Any variable name prefixed with ^ is global and persistent, with a sparse multi-dimensional array structure to be able to organize and access the variables (e.g. ^PEOPLE(45,"firstname") could be "Matthew" for the first name of person ID 45). Lives on today in a commercial implementation from Intersystems, and a couple Free Software implementations (Reference Standard M, GT.M, and the GT.M fork YottaDB). The seamless global storage is really nice, but the language itself is truly awful.
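To give a feel for the `^PEOPLE(45,"firstname")` style of access, here is a small emulation built on Python's standard `shelve` module. This is not how MUMPS implements globals; the `Global` class and the `demo` helper are assumptions made for the sketch, and subscripts are simply stored under their `repr`.

```python
import os
import shelve
import tempfile

class Global:
    """A persistent sparse multi-dimensional array, loosely in the spirit of a MUMPS global."""

    def __init__(self, path):
        self._db = shelve.open(path)

    def __setitem__(self, subscripts, value):
        # shelve keys must be strings, so the subscript tuple is stored by repr
        self._db[repr(subscripts)] = value

    def __getitem__(self, subscripts):
        return self._db[repr(subscripts)]

    def close(self):
        self._db.close()

def demo():
    path = os.path.join(tempfile.mkdtemp(), "people")
    people = Global(path)
    people[45, "firstname"] = "Matthew"
    people.close()                     # "shut the program down"
    people = Global(path)              # "start it up again"
    value = people[45, "firstname"]    # the value is still there
    people.close()
    return value
```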
TADS, an OOP language + VM for interactive fiction, has this "value database" model. Once loaded into memory, the compiled image can be updated with values stored in a separate save file. The compiled image itself could store updated values as well.
In fact, it does this during a "preinit" stage that runs immediately after compilation. Once all preinit code finishes executing, the compiled image is overwritten with the updated state. The language includes a "transient" keyword to permit creating objects that should not be stored.
This same mechanism permits in-memory snapshots, which are used for the game's UNDO feature. No need to rewind or memento-ize operations, just return to a previous state.
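The snapshot-based UNDO described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not TADS code; the `Game` class and its state layout are hypothetical.

```python
import copy

class Game:
    """Toy game whose UNDO restores a whole-state snapshot instead of reversing operations."""

    def __init__(self):
        self.state = {"room": "cellar", "inventory": []}
        self._snapshots = []

    def take_turn(self, action):
        # Save the full state before the turn, then let the action mutate it.
        self._snapshots.append(copy.deepcopy(self.state))
        action(self.state)

    def undo(self):
        # Return to the previous state; nothing to do at turn 0.
        if self._snapshots:
            self.state = self._snapshots.pop()
```

The trade-off is memory for simplicity: every turn stores a full copy, but no action needs a hand-written inverse.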
It's not a general-purpose mechanism. After all, the language is for building games with multiple player-chosen save files, and to permit restarting the game from a known Turn 0 state.
Image persistence was one of the cool ideas of Smalltalk. And in practice, one of the biggest drawbacks. Cruft and old values accumulated steadily, with very little way to find and eliminate them. Transient execution has some cons. But on the pro side, every run starts from a "clean slate."
> with very little way to find and eliminate them.
The best Smalltalk these days is GlamorousToolkit: https://gtoolkit.com/
It has a sort of git in it, so you can easily "rollback" your image to previous states. So going back and forth in history is trivial.
> you can easily "rollback" your image to previous states.
Sounds very interesting. Does it support multi-developer merging and/or rebasing of changes?
I believe it's just a git repo behind the scenes. Not sure if the UI exposes those things as I never used that in multi-developer scenarios! Give it a go and see.
This may fall in the "you think you do, but you don't" category, but I've always wanted a Smalltalk (or similar, not that picky) with a persistent virtual memory.
That is, the VM is mapped to a backing file, changes persisted automatically, no "saving", limited by drive space (which, nowadays, is a lot). But nowadays we also have vast memory space to act as a page cache and working memory.
My contrived fantasy use case was having a simple array named "mail", an array containing all of my email messages (as email objects, of course). Naturally as you get more mail, the array gets longer. Also, as you delete mail, the array shifts. It's no different, roughly, than the classic mbox format, save it's not just text, it's objects.
You can see that if you delete an email from a large (several GBs) array, there would be a lot of churn. That implies maybe it's not a great idea to use that data structure, but that's not the point. You CAN use that data structure if you like (just like you can use mbox if you like).
Were it to be indexed, that would be done with parallel data structures (trees or hashes or whatever).
But this is all done automagically. Just tweaks to pages in working memory backed by the disk using the virtual memory manager. Lots and lots of potential swapping. C'est la vie, no different from anything else. This is what happens when you map 4TB into a 16GB work space.
The problem with such a system is how fragile it potentially is. Corrupt something and it happily persists that corruption, wrecking the system. You can't reboot to fix it.
Smalltalk suffers from that today. Corrupt the image (oops, did I delete the Object become: method again?), and it's gone for good. This is mitigated by having backup images, and the changelist to try to bring you back to the brink but no further.
I'm guessing a way to do that in this system is to use a copy-on-write facility. Essentially, snapshot the persistent store on each boot (or whatever), and present a list of previous snapshots at start up.
Given the structure of a ST VM you'd like to think this is not that dreadful to work up. I'd like to think a paper napkin implementation PoC would be possible, just to see what it's like. One of those things where the performance isn't really that great, but modern systems are so fast, we don't really notice it in human terms.
But I do think it would be interesting.
Have you looked at Pharo? Their git integration makes it relatively easy to export and backup parts of your main image, and to pull the things back into a fresher one once you mess up.
The MUMPS database is wild. When I was working in MUMPS, it was so easy and fun to whip up an internal tool to share with my coworkers. You don't have to give any special thought at all to persistence, so you're able to stay in the flow of thinking about your business logic.
But as you said, the language itself is almost unbearable to use.
I had a professor who is responsible for a lot of the more "modern" MUMPS stuff (let's be real, MUMPS is OLD!). Guy was pretty unbearable too.
I relate to this post so so much. https://jerf.org/iri/post/2025/programming_language_ideas/#v...
To me, this idea seems so, so insane (especially for things like extraction: you start extracting a zip on one device, it gets partially extracted, and then you continue extracting it on the other). (Yes, sure, you could loop over each file, keep a list of files already unzipped, and unzip only the files which haven't been unzipped yet.)
But imagine if the file to be extracted is a single file in the zip (like a 100 gig file).
I don't know, I have played with this using CRIU and it worked. QEMU can also work. But this idea is cool.
Instead of using a default storage where entropy can hit, I would personally like it if the values were actually stored in SQLite, and maybe combined with the Truly Relational Language idea as well (but without truly requiring you to learn SQLite).
I had posted this on Hacker News as well, and theoretically it's possible with the brainfu*-in-SQLite interpreter that I had found. But I don't know... If anybody knows of a new language or a method for integrating this into new languages, that would be pretty nice.
Oh my god, another banger is the modular monolith part. I personally believe that the Java/Kotlin ecosystem, Golang with NATS, and Elixir/Erlang can be considered modular monoliths.
Another cool way is using Encore in Golang or TypeScript and then hosting the AWS stack yourself or running Encore locally (I am not sure).
There is also the SST framework, which can be run on Docker, and also https://github.com/vercel/fun
I think the coloured function problem boils down to the fact that async functions are not naturally a specific kind of sync function, but the other way around.
Functions are so ubiquitous we forget what they really are: a type of guarantee about the conditions under which the code within will run. Those guarantees include the availability of arguments and a place to put the return value (on the stack).
One of the key guarantees about sync functions is the call structure: one thread of execution will be in one function and one function only at any point during the program; the function will only be exited on return (or exception, or panic) or call of another function; and all the local data will be available only for the duration of that function call.
From that perspective, async functions are a _weakening_ of the procedural paradigm where it is possible to "leave behind" an instruction pointer and stack frame to be picked up again later. The ability to suspend execution isn't an additional feature, it's a missing guarantee: a generalisation.
There is always an interplay between expressiveness and guarantees in programming languages. Sometimes, it is worth removing a guarantee to create greater expressiveness. This is just an example of that.
I mentioned exceptions earlier — it's no wonder that exceptions and async both get naturally modelled in the same way (be it with monads or algebraic effects or whatever). They are both examples of weakening of procedural guarantees. Exceptions weaken the guarantee that control flow won't exit a function until it returns.
I think the practical ramifications of this are that languages that want async should be thinking about synchronous functions as a special case of suspendable functions — specifically the ones that don't suspend.
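Python's coroutines happen to make this relationship easy to see: a coroutine that never reaches a suspension point can be driven to completion in a single step, exactly like a synchronous call. The sketch below is only an illustration of that point; `Suspend`, `add`, `add_later`, and `call_sync` are names invented here.

```python
class Suspend:
    """Awaitable that yields control once, i.e. a genuine suspension point."""

    def __await__(self):
        yield

async def add(a, b):
    # Declared async, but never suspends: behaves like a plain function.
    return a + b

async def add_later(a, b):
    await Suspend()  # leaves its frame behind to be resumed later
    return a + b

def call_sync(coro):
    """Run a coroutine as if it were a synchronous function.

    Succeeds only for the special case of coroutines that do not suspend.
    """
    try:
        coro.send(None)
    except StopIteration as finished:
        return finished.value  # ran straight through to its return
    coro.close()
    raise RuntimeError("coroutine suspended; it is not a plain synchronous call")
```

Here `call_sync(add(2, 3))` returns 5 without any event loop, while `call_sync(add_later(2, 3))` raises, because that coroutine uses the extra freedom a synchronous function gives up.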
As a counterpoint, I can imagine a lot of implementation complexities. Hardware is geared towards the classical procedural paradigm, which provides an implementation foundation for synchronous procedures. The lack of that for async can partially explain why language authors often don't provide a single async runtime, but have this filled in by libraries (I'm thinking of Rust and Kotlin here).
Interesting that E is cited under “capabilities”, but not under “loosen up the functions”. E’s eventual-send RPC model is interesting in a number of ways. If the receiver is local then it works a bit like a JavaScript callback in that there’s an event loop driving execution; if it’s remote then E has a clever “promise pipelining” mechanism that can hide latency. However E didn’t do anything memorable (to me at least!) about handling failure, which was the main point of that heading.
For “capabilities” and “A Language To Encourage Modular Monoliths”, I like the idea of a capability-secure module system. Something like ML’s signatures and functors, but modules can’t import, they only get access to the arguments passed into a functor. Everything is dependency injection. The build system determines which modules are compiled with which dependencies (which functors are passed which arguments).
An existing “semi-dynamic language” is CLOS, the Common Lisp object system. Its metaobject protocol is designed so that there are clear points when defining or altering parts of the object system (classes, methods, etc.) at which the result is compiled, so you know when you pay for being dynamic. It’s an interesting pre-Self design that doesn’t rely on JITs.
WRT “value database”, a friend of mine used to work for a company that had a Lisp-ish image-based geospatial language. They were trying to modernise its foundations by porting to the JVM. He had horror stories about their language’s golden image having primitives whose implementation didn’t correspond to the source, because of decades of mutate-in-place development.
The most common example of the “value database” or image-based style of development is in fact your bog standard SQL database: DDL and stored procedures are very much mutate-in-place development. We avoid the downsides by carefully managing migrations, and most people prefer not to put lots of cleverness into the database. The impedance mismatch between database development by mutate-in-place and non-database development by rebuild and restart is a horribly longstanding problem.
As for “a truly relational language”, at least part of what they want is R style data frames.
Interesting points!
We're working on a language with some of these ideas:
https://www.firefly-lang.org/
Object capabilities, async calls as easy as sync calls, modular monoliths, and (eventually) unified logging.
None of the relational language features though.
Feedback appreciated!
My wild idea is that I'd like to see a modern "high-level assembler" language that doesn't have a callstack. Just like in the olden days, all functions statically allocate enough space for their locals. Then, combine this with some semi-convenient facility for making sure that local variables for a given function always fit into registers; yes, I admit that I'm strange when I say that I dream of a language that forces me to do manual register allocation. :P But mostly what I want to explore is if it's possible to create a ""modern"" structured programming language that maps cleanly to assembly, and that provides no optimization backend at all, but has enough mechanical sympathy that it still winds up fast enough to be usable.
> all functions statically allocate enough space for their locals.
Would you still have distinct activation records per call or forfeit the ability to have reentrant functions and recursion?
That's one of the main reasons to move to dynamic (as in a call stack) allocation of your activation records versus a single static allocation per function.
In this hypothetical language I'm assuming that recursion is unsupported and that if threading is supported at all, then each thread has its own copy of every function's locals (or at least every function that can be called concurrently; structured concurrency might be leveraged to prove that some functions don't need to be reentrant, or maybe you just chuck a mutex in each function prologue and YOLO). However, while enforcing that basic recursion is forbidden isn't too difficult (you make the language statically-typed, all names lexically-scoped, and don't support forward declarations), it does probably(?) mean that you also lose first-class functions and function pointers, although I haven't thought deeply about that.
Have you thought about what happens if you want to read and parse a file? Do you declare the maximum filesize you want to support and statically allocate that much memory?
I'm not intending to imply that the language I'm describing can't support heap-allocated memory; Rust shows us that it's even possible to do so without having to manually deallocate, if you're okay with a single-ownership discipline (which is a rather simple analysis to implement, as long as you don't also want a borrow checker along for the ride). Instead, this is about trying to make a language that makes it easy to keep locals in registers/cache, rather than relying on the compiler backend to do register allocation and hoping that your CPU can handle all that cache you're thrashing.
No, you have a scoped pointer to dynamically allocated memory; when the scoped pointer is destroyed/cleaned up/released at the end of the function, it releases the allocated memory.
A useful purpose for such a thing is in certain embedded, hard-real-time, or mission-critical scenarios.
Many such programming environments need strict control over stack sizes to avoid any possibility of stack overflow.
I had a similar notion a few years back, thinking about a somewhat wider range of "scoped guarantees". The compiler would compute things such as the maximum stack usage of a function, and this would "roll up" to call sites automatically. This could also be used to enforce non-usage of certain dangerous features such as locks, global flags, or whatever.
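A sketch of how such a "roll up" could work, assuming the compiler already knows each function's own frame size and its possible callees. The function `max_stack_usage` and its input format are hypothetical; a real compiler would work on its own call-graph representation.

```python
def max_stack_usage(frame_size, calls, root):
    """Worst-case stack bytes needed to run `root`.

    frame_size: {function name: bytes used by its own frame}
    calls:      {function name: list of functions it may call}
    Raises ValueError on recursion, where no static bound exists.
    """
    memo = {}
    in_progress = set()

    def visit(fn):
        if fn in memo:
            return memo[fn]
        if fn in in_progress:
            raise ValueError(f"recursion through {fn!r}: stack usage is unbounded")
        in_progress.add(fn)
        # A function needs its own frame plus the deepest of its callees.
        deepest_callee = max((visit(c) for c in calls.get(fn, [])), default=0)
        in_progress.discard(fn)
        memo[fn] = frame_size[fn] + deepest_callee
        return memo[fn]

    return visit(root)
```

The same traversal could propagate other properties, such as "may take a lock", by replacing the sum-and-max with a boolean OR over callees.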
Why would you like to have this language? Is it about control over the execution? About better ways to personally optimize? Or just intellectual pleasure? Or is it about reliving the olden days of assembly language programming but with modern conveniences?
I would simply find pleasure in being able to understand basically every level of the stack. For a RISC architecture, it's not too hard to get a grasp on how it works. Likewise for a simple-enough programming language. The problem(?) is that in between these two is an opaque black box--the optimization backend, which I feel I have no hope of understanding. So instead I wonder if it's possible to have a "safe" (safer than C) and "high-level" (more abstractive than C) language that is still useful and semi-performant, and I'm wondering how much ergonomics would need to be sacrificed to get there. It's a thought experiment.
We have built something that hits on points 1, 3, 5, and 7 at https://reboot.dev/ ... but in a multi-language framework (supporting Python and TypeScript to start).
The end result is something that looks a lot like distributed, persistent, transactional memory. Rather than explicit interactions with a database, local variable writes to your state are transactionally persisted if a method call succeeds, even across process/machine boundaries. And that benefits point 7, because transactional method calls compose across team/application boundaries.
[1] Loosen Up The Functions [3] Production-Level Releases [5] Value Database [7] A Language To Encourage Modular Monoliths
This seems to be similar to Azure Durable Functions:
https://learn.microsoft.com/en-us/azure/azure-functions/dura...
They are related, for sure. But one of the biggest differences is that operations affecting multiple Reboot states are transactional, unlike Azure's "entity functions".
Because multiple Azure entity functions are not updated transactionally, you are essentially always implementing the saga pattern: you have to worry about cleaning up after yourself in case of failure.
In Reboot, transactional function calls automatically roll back all state changes if they fail, without any extra boilerplate code. Our hypothesis is that that enables a large portion of an application to skip worrying about failure entirely.
Code that has side-effects impacting the outside world can be isolated using our workflow mechanism (effectively durable execution), which can themselves be encapsulated inside of libraries and composed. But we don't think that that is the default mode that developers should be operating in.
Starlark, a variant of Python, can be thought of as semi dynamic: all mutation in each file happens once, single threaded, and then that file and all its data structures are frozen so downstream files can use it in parallel
A lot of "staged" programs can be thought of as semi dynamic as well, even things like C++ template expansion or Zig comptime: run some logic up front, freeze it, then run the rest of the application later
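The "mutate up front, then freeze" pattern can be imitated in plain Python. Starlark enforces the freeze inside its interpreter; the `freeze` helper below is only an emulation written for this example, using read-only views from the standard library.

```python
from types import MappingProxyType

def freeze(value):
    """Recursively convert dicts, lists, and sets into read-only equivalents."""
    if isinstance(value, dict):
        return MappingProxyType({k: freeze(v) for k, v in value.items()})
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    if isinstance(value, (set, frozenset)):
        return frozenset(freeze(v) for v in value)
    return value  # numbers, strings, etc. are already immutable

# Stage 1: ordinary, single-threaded mutation while "initializing".
config = {"targets": ["a"], "opts": {"debug": False}}
config["targets"].append("b")

# Stage 2: freeze; later code may read the data but not change it.
FROZEN = freeze(config)
```

After the freeze, an assignment such as `FROZEN["new"] = 1` raises `TypeError`, which is what makes it safe to share the data across threads without locks.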
An interesting problem I've played around with a fair bit is the idea of a maximally expressive non-Turing-complete language, trying to make a language that is at least somewhat comfortable to use for many tasks, while still being able to make static assertions about runtime behavior.
The best I've managed is a functional language that allows for map, filter, and reduce, but forbids recursion or any other looping or infinite expansion in usercode.
The pitch is that this kind of language could be useful in contexts where you're executing arbitrary code provided by a potentially malicious third party.
Non-Turing-completeness doesn’t buy you that much, because you can still easily multiply runtime such that it wouldn’t terminate within your lifetime. With just map you can effectively build the cross product of a list with itself. Do that in an n-times nested expression (or nested, non-recursive function calls), and for a list of length k the result is a list of length kⁿ. And with reduce you could then concatenate a string with itself those kⁿ times, resulting in a string (and likely runtime and memory usage) of length 2^kⁿ.
If you want to limit the runtime, you need to apply a timeout.
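The growth described above is easy to reproduce with tiny inputs. In this sketch the `for` loop in `nested_cross` merely stands in for n levels of syntactic nesting in the restricted language; the function names and the choice k = 2, n = 2 are assumptions for illustration.

```python
from functools import reduce

def nested_cross(items, n):
    """n-fold cross product of `items` with itself: k items give k**n tuples."""
    result = [()]
    for _ in range(n):  # stands in for n nested map expressions
        result = [prefix + (x,) for prefix in result for x in items]
    return result

def doubled_string(combos):
    """Double a one-character string once per element: length 2**len(combos)."""
    return reduce(lambda s, _: s + s, combos, "a")

combos = nested_cross([1, 2], 2)   # k = 2, n = 2 -> 2**2 = 4 tuples
blob = doubled_string(combos)      # 2**4 = 16 characters
```

With k = 10 and n = 3 the list already has 1000 elements, and the doubled string would need 2^1000 characters, so termination is guaranteed only in theory.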
I think you're asking for Starlark (https://starlark-lang.org), a language that strongly resembles Python but isn't Turing-complete, originally designed at Google for use in their build system. There's also Dhall (https://dhall-lang.org), which targets configuration use cases; I'm less familiar with it.
One problem is that, while non-Turing-completeness can be helpful for maintainability, it's not really sufficient for security. Starlark programs can still consume exponential amounts of time and memory, so if you run an adversary's Starlark program without sandboxing it, you're just as vulnerable to denial-of-service attacks as you'd be with a Turing-complete language. The most common solution is sandboxing, wherein you terminate the program if it exceeds time or memory limits; however, once you have that, it's no longer necessary for the language to not be Turing-complete, so you might as well use a popular mainstream language that's easy to sandbox, like JavaScript.
One other intriguing option in the space is CEL (https://cel.dev), also designed at Google. This targets use cases like policy engines where programs are typically small, but need to be evaluated frequently in contexts where performance matters. CEL goes beyond non-Turing-completeness, and makes it possible to statically verify that a program's time and space complexity are within certain bounds. This, combined with the lack of I/O facilities, makes it safe to run an adversary's CEL program outside a sandbox.
If you're interested in prior art, Ian Currie's NewSpeak was an attempt at a non-Turing complete language for safety critical systems. Most of the search results are for a different language with the same name, but "RSRE currie newspeak" should find relevant links.
Could be a good idea for a multiplayer ingame scripting language.
Well OP, are you me? Everything you listed is also in my short wishlist for a programming language (well, except for the value database: once you have first-class relational tables in your language, persistence can be tied to the table identity and doesn't need to be implicit).
Capabilities and dynamic scoping for "modularisation" nicely lead to implicit variables instead of truly global dynamically scoped variables. Implicit variables also probably work well to implement effect systems which means well behaved asyncs.
Edit: other features I want:
- easy embedding in other low level languages (c++ specifically)
- conversely, easy to embed functions written in another language (again c++).
- powerful, shell-like, process control system (including process trees and pipelines), including across machines.
- built-in cooperative shared memory concurrency, and preemptive shared nothing.
- built-in distributed content addressed store
I guess I want Erlang :)
I'll throw in another idea I've been thinking about for a while now.
Most languages have a while construct and a do-while.
The while runs as: check the condition, then run the body. The do-while switches the order: run the body, then check the condition.
The issue with the while is that more often than not you need to do some preparation before the condition. So you need to move that to a function, or duplicate it before and inside the loop. Do-while doesn't help, since with that you can't do anything after the condition. The alternative is a while(true) with a condition in the middle.
But what if there was a language construct for this? Something like `do { prepare } while (condition) { body }`.
Is there a language that implements this somehow? (I'm sure there is, but I don't know of one.)
The best thing is that this construct can be optimized in assembly perfectly.
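For comparison, this is what the "while(true) with a condition in the middle" workaround looks like in Python, which has neither do-while nor the proposed construct. The function names and the line-reading task are made up for the example; the second version uses the `:=` operator (Python 3.8+), which covers the case where the preparation fits in a single expression.

```python
import io

def read_records(stream):
    """Collect stripped non-empty lines until a blank line or end of input."""
    records = []
    while True:
        line = stream.readline().strip()  # preparation, runs before every test
        if not line:                      # the condition, in the middle
            break
        records.append(line)              # body, runs only while the condition held
    return records

def read_records_walrus(stream):
    """Same loop, with the preparation folded into the condition."""
    records = []
    while (line := stream.readline().strip()):
        records.append(line)
    return records
```

Once the preparation needs more than one statement, only the first form works, which is the gap the proposed construct is meant to fill.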
I suspect you'll love Common Lisp's LOOP: https://gigamonkeys.com/book/loop-for-black-belts
Example:
You can insert a condition check in the middle, of course:
And much, much more. It's the ultimate loop construct.
Don't forget about the `prog*` family.
---
`prog1`: Evaluates foo, then bar(s), and returns the result of evaluating foo and discards the results of bar(s).
Useful if `foo` is the condition and you need to perform some change to it immediately after, eg:
---
`prog2`: Evaluates foo, then bar, then baz(s) (if present), returns the result of evaluating bar and discards the results of evaluating foo and baz(s).
Might be what GP wants. `foo` is the preparation, `bar` is the condition, and `baz` can be some post-condition mutation on the compared value. Not too dissimilar to
With `prog2` you could achieve similar behavior with no built in `for`:
---
`progn`: Evaluate each foo in order, return the result of evaluating the last element of foo and discard all the others.
`progn` is similar to repeated uses of the comma operator in C, which GP has possibly overlooked as one solution.
actually, according to the LOOP syntax, the REPEAT clause has to follow the FOR clause...
Just a silly example, but it does work on SBCL at least.
A bunch of things work in implementations but are not standard conforming.
Lambda The Ultimate Loop!
In some way it's the dual of break, in that you want to jump into the middle of the loop, while break is to jump out of it.
Let's rewrite the loop this way, with 'break' expanded to 'goto':
The dual would be:
Both constructs need two points: where the jump begins and where it lands. The 'break' is syntactic sugar that removes the need to specify the label 'exitpoint'. In fact with 'break' the starting point is explicit, it's where the 'break' is, and the landing point is implicit, after the closing '}'.
If we want to add the same kind of syntactic sugar for the jump-in case, the landing point must be explicit (no way for the compiler to guess it), so the only one we can make implicit is the starting point, that is where the 'do' is.
So we need: a new statement, let's call it 'entry', that is the dual of 'break' and a new semantic of 'do' to not start the loop at the opening '{' but at 'entry'.
Is it more readable than today's syntax? I don't know...
Ada has had something similar and very flexible since the 80s ... like:
There's not that much new under the prog lang sun :(
C-style for-loop is kinda sorta this. Although the "prepare" part has to be an expression rather than a statement, given that you have the comma operator and ?: you can do a lot there even in C. In C++, you can always stick a statement in expression context by using a lambda. So:
However, the most interesting take on loops that I've seen is in Sather, where they are implemented on top of what are, essentially, coroutines, with some special facilities that make it possible to exactly replicate the semantics of the usual `while`, `break` etc in this way: https://www.gnu.org/software/sather/docs-1.2/tutorial/iterat...
Well, I am in the process of making a language where general loops will look like
I also think you'd enjoy Knuth's article "Structured Programming with go to Statements" [0]. It's the article that gave us the "premature optimization is the root of all evil" quote but it's probably the least interesting part of it. Go read it, it has several sections that discuss looping constructs and possible ways to express them.
[0] https://pic.plover.com/knuth-GOTO.pdf
I think we have a similar way of thinking. I once wrote a blog post about a for loop extension (based on Golang for illustration) [0].
[0] https://lukas-prokop.at/articles/2024-04-24-for-loop-extensi...
Interesting idea, but once you add scoping:
Can the second block of the do-while see `value` in its lexical scope? If yes, you have this weird double brace scope thing. And if no, most non-trivial uses will be forced to fall back to `if (...) break;` anyway, and that's already clear enough imo.
The scope should be unique, yes. In your example value should be visible.
You are right about the weird double braces, but I can't think of an alternate syntax other than just removing the braces around the while. But in that case it may seem odd to have a keyword that can only be used inside a specific block... which is basically a macro for an if(.)break; Maybe I'm too used to the C/Java syntax, maybe with a different way of defining blocks?
>Can the second block of the do-while see `value` in its lexical scope? If yes, you have this weird double brace scope thing
As long as it's documented and expected, it's not weird.
The scope then is the whole "do-while" statement, not the brace.
That seems more like a programmer expectations issue than something fundamental. Essentially, you have "do (call some function that returns a chunk of state) while (predicate that evaluates the state) ..."
Hard to express without primitives to indicate that, maybe.
Let me FTFY:
Eiffel has the loop structure.
from <initialization statements> until <termination condition> loop <group of statements> end
You mean like a shell's while-do-done? It's just about allowing statements as the conditions, rather than just a single expression. Here's an example from a repl I wrote:
The `printf` is your `prepare`.
This should also be doable in languages where statements are expressions, like Ruby, Lisp, etc.
Here's a similar Ruby repl:
Exactly, here you are basically keeping it as a while with a condition but allowing it to be any code that at the end returns a boolean, although you need to make sure that variables defined in that block can be used in the do part.
Sidenote: I wasn't aware that shell allows for multiple lines, good to know!
I don't write a lot of while loops so this is just a bit unfamiliar to me, but I'm not really understanding how this isn't the same as `do{block}while(condition);`? Could you give a simple example of what kind of work `prepare` is doing?
Think of a producer (a method that returns data each time you request one, like reading a file line by line or extracting the top of a queue for example) that you need to parse and process until you find a special element that means "stop".
Something like
I'm aware this example can be trivially replaced with a while(data=parse(producer.get())){process(data)}, but then you are forced to have a method, and if you need both the raw and parsed data at the same time, either you mix them into a wrapper or you need to somehow carry two variables at the same time through the parse > condition > process steps.
A do-while here also has the same issue, but in this case after you check the condition you can't do any processing afterwards (unless you move the check and process into a single check_and_process method... which you can totally do, but again the idea is to not require it).
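For comparison, here is a minimal Python sketch of the same producer/parse/process shape written with today's `while True` plus `break`. The names `STOP`, `parse`, and `run` are made up for illustration, and the sketch assumes the stop element always arrives before the producer runs dry.

```python
from collections import deque

STOP = "STOP"  # hypothetical sentinel meaning "no more work"

def parse(raw):
    # Hypothetical parser: strip whitespace and upper-case the record.
    return raw.strip().upper()

def run(producer):
    processed = []
    while True:
        # "prepare" part: runs before the condition on every iteration
        raw = producer.popleft()
        data = parse(raw)
        if data == STOP:  # the condition, checked in the middle of the loop
            break
        # "process" part: both the raw and the parsed value are in scope
        processed.append((raw, data))
    return processed

result = run(deque([" a ", "b", " stop ", "never reached"]))
```

Both `raw` and `data` stay visible after the check without a wrapper object, which is the property the proposed syntax is trying to give a name to.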
Not quite the same but almost feels like the BEGIN block in awk.
The C language construct for that is 'goto'.
Yeah, but the parent wants a non brain-damaged too-general construct
What if loops are a design mistake?
Look at C: it has 5 loop-related keywords (4.5; 'break' is two-timing in 'switch') yet it is still not enough.
When we are on subject of loops... I'd love to have 'else' block for loops that runs when the loop had zero iterations.
Not the same thing (although I thought it was), according to the Python docs, but related:
https://docs.python.org/3/reference/compound_stmts.html
See sections 8.3, the for statement, and 8.2, the while statement.
Yeah, `while...else` in Python does the wrong thing. Executes `else` block when the loop finished normally (not through `break`).
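A small sketch of what Python actually does, plus one common workaround for the zero-iteration case. The workaround only applies to `for` loops over an iterable (not to `while`), and the function names are made up for illustration.

```python
def find(items, wanted):
    """Return a label describing how the loop ended."""
    for item in items:
        if item == wanted:
            outcome = "break"
            break
    else:
        # Runs whenever the loop ends without `break`,
        # including when `items` is empty.
        outcome = "else"
    return outcome

def loop_ran(items):
    """Detect the zero-iteration case with a sentinel, without re-checking."""
    sentinel = object()
    item = sentinel
    for item in items:
        pass  # loop body would go here
    # If the loop never ran, `item` was never rebound.
    return item is not sentinel
```

So `else` here means "no `break` happened", which is why an empty sequence and a fully consumed one look the same to it.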
Scala for example has a `breakable {}` block that lets you indicate where you should land after a `break`
However I have no idea how to implement the kind of `else` I described in any language without checking the condition twice.
PowerShell can process 0..n input objects from the pipeline using BEGIN {...} PROCESS {...} END {...} blocks.
I find this so incredibly useful, that I miss it from other languages.
Something related that I've noticed with OO languages such as Java is that it tends to result in "ceremony" getting repeated n times for processing n objects. A well-designed begin-process-end syntax for function calls over iterables would be amazing. This could apply to DB connection creation, security access checks, logging, etc...
In Scala you can do:
This runs all 3 in order every iteration but quits if condition evaluates to false. It just uses the fact that value of a block is the value of the last expression in the block.
Scala has a lot of syntax goodies although some stuff is exotic. For example to have a 'break' you need to import it and indicate where from exactly you want to break out of.
At what point are you just doing async and coroutines?
> What about a language where for any given bit of code, the dynamicness is only a phase of compilation?
This is (essentially) Crystal lang's type system. You end up with semantic analysis/compilation taking a significant amount of time, longer than other comparable languages, and using a lot of resources to do so.
I like a lot of these ideas.
"Semi-dynamic" is one of the most common architectures there is for large & complex systems. AAA games are usually written in a combination of C++ and a scripting language. GNU Emacs is a Lisp application with a custom interpreter that is optimized for writing a text editor. Python + C is a popular choice as well as Java + Groovy or Clojure, I've even worked with a Lua + FORTRAN system.
I also think "parsers suck". It should be a few hundred lines at most, including the POM file, to add an "unless" statement to the Java compiler. You need to (1) generate a grammar which references the base grammar and adds a single production, (2) create a class in the AST that represents the "unless" statement and (3) add a transformation that rewrites
You should be able to mash up a SQL grammar and the Java grammar so you can write
This system should be able to export a grammar to your IDE. Most parser generators are terribly unergonomic (cue the event-driven interface of yacc) and not accessible to people who don't have a CS education (if you need a bunch of classes to represent your AST shouldn't these get generated from your grammar?) When you generate a parser you should get an unparser. Concrete syntax trees are an obscure data structure but were used in obscure RAD tools in the 1990s that would let you modify code visually and make the kind of patch that a professional programmer would write.
The counter to this you hear is that compile time is paramount and there's a great case for that in large code bases. (I had a system with a 40 minute build.) Yet there's also a case that people do a lot of scripty programming and trading compile time for ergonomics can be a win (see Perl and REBOL).
I think one goal in programming languages is to bury Lisp the way Mark Antony buried Caesar. Metaprogramming would be a lot more mainstream if it was combined with Chomsky-based grammars, supported static typing, worked with your IDE and all that. Graham's On Lisp is a brilliant book (read it!) that left me disappointed in the end because he avoids anything involving deep tree transformations or compiler theory: people do much more advanced transformations to Java bytecodes. It might be easier to write those kinds of transformations if you had an AST comprised of Java objects instead of the anarchy of nameless tuples.
I love these ideas! I've been thinking about the "fully relational" language ever since I worked with some product folks and marketers at my start up 15 years ago who "couldn't code" but were wizards at cooking up SQL queries to answer questions about what was going on with our users and product. There was a language written in Rust, Tablam[0], that I followed for a while, which seemed to espouse those ideas, but it seems like it's not being worked on anymore. And Jamie from Scattered Thoughts[1] has posted some interesting articles in that direction as well. He used to work on the old YC-company/product LightTable or Eve or something, which was in the same space.
I've also always thought Joe Armstrong's (RIP) thought of "why do we need modules" is really interesting, too. There's a language I've seen posted on HN here a couple times that seems to go in that approach, with functions named by their normalized hash contents, and referred to anywhere by that, but I can't seem to remember what it's called right now. Something like "Universe" I think?
[0] https://github.com/Tablam/TablaM
[1] https://www.scattered-thoughts.net
[2] https://erlang.org/pipermail/erlang-questions/2011-May/05876...
> with functions named by their normalized hash contents, and referred to anywhere by that, but I can't seem to remember what it's called right now. Something like "Universe" I think?
Unison: https://www.unison-lang.org/docs/the-big-idea/
I think the problem with "big" language ideas is, that as long as they match exactly your needs, they're great, but if they're slightly off, they can be a pain in the ass.
I'm wondering if languages could provide some kind of meta information, hooks or extension points, which could be used to implement big ideas on top. These big ideas could then be reused and modified depending on the needs of the project.
In which Jerf longs for PHP. Every single point has been in, and actively used, for a long while. The __call() & friends is particularly nifty - simple mental model, broad applicability, in practice used sparingly to great effect.
All in all a very enjoyable post.
The section about language support for modular monoliths reminds me of John Lakos's "Large-Scale C++ Software Design", which focuses on the physical design/layout of large C++ projects to enforce interfaces and reduce coupling and compilation time. Example recommendations include defining architecture layers using subdirectories and the PImpl idiom. It's pretty dated (1996, so pre-C++98), but still a unique perspective on an overlooked topic.
https://www.amazon.com/dp/0201633620
The only thing I want added to every programming language I use is the ability to call functions and handle data structures provided by libraries and services written in other languages without me having to write arcane wrappers.
It's not very convincing to me when the article talks about truly relational language but fails to mention Prolog and anything that we learned from it.
Logic languages are definitely not what I'd expect a relational-first language to look like.
What we learned from Prolog is mostly that starting from an exponentially-complex primitive and then trying to beat it into submission doesn't work at scale. Relational DBs don't have that problem. They do go n-squared and n-cubed and so forth easily, but there are lots of solutions to that as well.
I'm not sure what you mean with "an exponentially-complex primitive". In my opinion, Prolog lets you start with simple relations (n-squared, using your terms) and then enables you to build more complex relations using them.
Thanks - this was one of the more interesting things I've read here in a while.
I wonder if "Programming languages seem to have somewhat stagnated to me.", a sentiment I share, is just me paying less attention to them or a real thing.
I think there is innovation, but there's more than innovation required to be a good language. If an innovative feature is the cornerstone of a language, it frequently means that the language neglects pragmatic coding features that, while not particularly special, contribute to the language being nice to use.
I feel like the next few years in languages will be about things like Rust descendants, where people with experience in using Rust want to keep what works for them but scale back some of the rigidity in favour of pragmatism.
It's also worth noting that there are existing languages that are also changing over time. Freepascal has developed a lot of features over the years that make it fairly distant from original Pascal. More recent languages like Haxe are still developing into their final form. TypeScript has gone from a language that provided a tangible solution to an existing problem to a quagmire of features that I'd rather not have.
for "Semi-Dynamic Language" it might be worth looking into rpython: interpreters written in rpython have two phases, in the first phase one has full python semantics, but in the second phase everything is assumed to be less dynamic, more restricted (the r of rpython?) so the residual interpreter is then transpiled to C sources, which, although compiled, can also make use of the built-in GC and JIT.
I immediately thought of Julia as a semi-dynamic language. Julia is a dynamic language, but (as I understand it) the first time a function is called with a specific type signature, that specific method is JIT compiled as static LLVM.
Which is then used for future dispatches on that same signature and gives it very good performance. Julia is dynamic, and definitely beats the 10x slower than C barrier jerf mentioned.
For what I was using it for at the time (~3 years ago when I used it seriously) it offered performance close to the compiled orbital analysis code we had (in more conventional languages, Fortran and C) but with the flexibility for developing models of Python and other dynamic/interactive languages. An excellent tradeoff: very small performance cost for better interactivity and flexibility.
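As a rough analogy only (this is not how Julia is implemented), the "compile once per type signature, then reuse" idea can be sketched in Python as a cache keyed on argument types. The `specialize` decorator and its `compile_log` attribute are hypothetical names, and the "compilation" step is just a placeholder.

```python
import functools

def specialize(generic):
    """Cache one callable per tuple of argument types."""
    cache = {}
    compile_log = []  # records each signature that triggered "compilation"

    @functools.wraps(generic)
    def dispatcher(*args):
        signature = tuple(type(a) for a in args)
        if signature not in cache:
            # Stand-in for real JIT compilation: a real system would emit
            # machine code specialized to `signature` at this point.
            compile_log.append(signature)
            cache[signature] = generic
        return cache[signature](*args)

    dispatcher.compile_log = compile_log
    return dispatcher

@specialize
def add(a, b):
    return a + b

add(1, 2)
add(3, 4)      # reuses the (int, int) entry
add(1.0, 2.0)  # new signature, so a second "compilation"
```

The cost is paid on the first call for each signature; every later call with the same types goes straight to the cached version.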
Sans the Python compatibility, rhai is pretty close to what rpython is trying to be.
https://github.com/rhaiscript/rhai
>https://github.com/rhaiscript/rhai#for-those-who-actually-wa...
(「 ⊙Д⊙)「
Of all the non-esolangs (=exolangs?) APL+kith seem to be almost designorismic/estuarine (formerly, Riverian) beasts..
In your informed opinion, how would it make sense to be thrilled thinking about (not just a semi-dynamic APL (=S-DAPL?) as above but) designing
Specifically, which of the sneering checklist items[0] would be killer to cross off?
[0] https://www.mcmillen.dev/language_checklist.html
(Note in particular that the very APL inspired Wolfram has monopoly with physicists BUT does nothing for engineers
https://www.stephenwolfram.com/media/physics-whiz-goes-into-...
1988
>“But we tricked him, so to speak,” says Nobelist Murray Gell-Mann, who helped to bring Wolfram west. “We gave him a Ph.D.” )
> A Truly Relational Language... Value Database
I helped on a language called Eve about 10 years ago. A truly relational language was exactly what that language was supposed to be, or at least that's what we were aiming at as a solution for a user-centric programming language.
https://witheve.com
The language we came up with was sort of like Smalltalk + Prolog + SQL. Your program was a series of horn clauses that was backed by an Entity-Attribute-Value relational database. So you could write queries like "Search for all the clicks and get those whose target is a specific id, then as a result create a new fact that indicates a button was pressed. Upon the creation of that fact, change the screen to a new page". We even played around with writing programs like this in natural language before LLMs were a thing (you can see some of that here https://incidentalcomplexity.com/2016/06/10/jan-feb/)
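To make the shape of that query concrete, here is a minimal Python sketch over a set of entity-attribute-value triples. This is not Eve syntax or Eve's engine; the attribute names, ids, and helper functions are invented for illustration.

```python
# Facts are (entity, attribute, value) triples, as in an EAV store.
facts = {
    ("click-1", "tag", "click"),
    ("click-1", "target", "button-7"),
    ("click-2", "tag", "click"),
    ("click-2", "target", "link-3"),
}

def query(facts, attribute, value):
    """Return the set of entities that have the given attribute/value pair."""
    return {e for (e, a, v) in facts if a == attribute and v == value}

def button_pressed_rule(facts, button_id):
    """Derive a 'pressed-by' fact for each click whose target is button_id."""
    clicks = query(facts, "tag", "click") & query(facts, "target", button_id)
    return {(button_id, "pressed-by", click) for click in clicks}

# Search for clicks on one button, then add the derived facts to the store.
derived = button_pressed_rule(facts, "button-7")
facts |= derived
```

A reactive system would rerun rules like this whenever the fact set changes; the sketch only shows a single pass.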
Here's a flappy bird game written in that style: https://play.witheve.com/#/examples/flappy.eve
It's very declarative, and you have to wrap your brain around the reactivity and working with collections of entities rather than individual objects, so programming this way can be very disorienting for people used to imperative OOP languages.
But the results are that programs are much shorter, and you get the opportunity for really neat tooling like time travel debugging, where you roll the database back to a previous point; "what-if" scenarios, where you ask the system "what would happen if x were y" and you can potentially do that for many values of y; "why not" scenarios, where you ask the system why a value was not generated; value provenance, where you trace back how a value was generated... this is the kind of tooling that just doesn't exist with most languages, because those languages are built to throw away as much information as possible at each stage of compilation. The kind of tooling I'm describing requires keeping and logging information about your program, and then leveraging it at runtime.
Most compilers and runtimes throw away that information as the program goes through the compilation process and as it's running. There is a cost to pay in terms of memory and speed, but I think Python shows that interpretation speed is not that much of a barrier to language adoption.
But like I said, that was many years ago and that team has disbanded. I think a lot of what we had in Eve still hasn't reached mainstream programming, although some of what we were researching found its way into Excel eventually.
> Loosen Up The Functions... Capabilities... Production-Level Releases... Semi-Dynamic Language... Modular Monoliths
I really like where the author's head is at. I think we have similar ideas about programming, because I've been developing a language called Mech that fits these descriptors to some degree since Eve development was shut down.
https://github.com/mech-lang/mech
So this language is not supposed to be relational like Eve, but it's more like Matlab + Python + ROS (or Erlang if you want to keep it in the languages domain).
I have a short 10 min video about it here: https://www.hytradboi.com/2022/i-tried-rubbing-a-database-on... (brief plug for HYTRADBOI 2025, Jamie also worked on Eve, and if you're interested in the kinds of thing the author is, I'm sure you'll find interesting videos at HYTRADBOI '22 archives and similarly interested people at HYTRADBOI '25), but this video is out of date because the language has changed a lot since then.
Mech is really more of a hobby than anything since I'm the only one working on it aside from my students, who I conscript, but if anyone wants to tackle some of these issues with me I'm always looking for collaborators. If you're generally interested in this kind of stuff drop by HYTRADBOI, and there's also the Future Of Coding slack, where likeminded individuals dwell: https://futureofcoding.org. You can also find this community at the LIVE programming workshop which often coincides with SPLASH: https://liveprog.org
I remember both LightTable and Eve. At the time I thought they were both really interesting ideas but wasn't sure where they were going.
Re-reading the Eve website now, with 10+ years more experience and understanding of languages, I'm really astounded at how brilliant Eve was, and how far ahead of its time it was (and still is). Also at how rare it is to have any revolutionary ideas in modern programming language design make it out of theory in contemporary times. There were many radical ideas in the 60s and 70s, but so much now is incremental.
It's a shame Eve couldn't continue, just to see what it would've become and the influence it would have had on language expectations. Really cool stuff in there. While not likely, I hope someone picks up those ideas and continues them.
Did the effort just run out of funding? Or did it hit a stumbling block?
As far as "semi-dynamic" goes, C# has an interesting take coming from the other direction - i.e. an originally fully statically typed language bolting on dynamic duck typing later.
It's done in a way that allows for a lot of subtlety, too. Basically you can use "dynamic" in lieu of most type annotations, and what this does is make any dispatch (in a broad sense - this includes stuff like e.g. overload resolution, not just member dispatch) on that particular value dynamic, but without affecting other values involved in the expression.
> in a broad sense - this includes stuff like e.g. overload resolution, not just member dispatch
It specifically allows for multiple dispatch that is not even available in most dynamic languages not called lisp!
> Some Lisps may be able to do all this, although I don’t know if they quite do what I’m talking about here; I’m talking about there being a very distinct point where the programmer says “OK, I’m done being dynamic” for any given piece of code.
In Common Lisp there are tricks you can pull like declaring functions in a lexical scope (using labels or flet) to remove their lookup overhead. But CL is generally fast enough that it doesn't really matter much.
You can declaim inline a toplevel function. That doesn't necessarily mean that it will be integrated into callers. Among the possible effects is that the dynamism of reference can be culled away. If a function A calls B where B is declaimed inline then A can be compiled to assume that definition of B. (Such that if B is redefined at run-time, A can keep calling the old B, not going through the #'B function binding lookup.)
I seem to remember that Common Lisp compilers are allowed to do this for functions that are in the same file even if they are not declaimed inline. If A and B are in the same file, and B is not declaimed notinline (the opposite of inline), then A can be translated to assume the B definition.
So all your helper functions in a Lisp module are allowed to be called more efficiently, not having to go through the function binding of the symbol.
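The same distinction between going through the name binding on every call and assuming a fixed definition can be shown with a loose Python analogy. This is not Common Lisp semantics, just an illustration; the default-argument trick stands in for what an inlining compiler is allowed to do.

```python
def b():
    return "old b"

def a_dynamic():
    # Looks up the name `b` on every call (late binding), so it follows
    # redefinitions made at run time.
    return b()

def a_frozen(_b=b):
    # Captures the current `b` object at definition time (early binding),
    # so later redefinitions of the name `b` are not seen.
    return _b()

def b():  # redefine b at run time
    return "new b"

dynamic_result = a_dynamic()  # goes through the current binding
frozen_result = a_frozen()    # still calls the original definition
```

The frozen version skips a lookup per call at the price of ignoring redefinition, which is the trade the inline declaration permits.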
For relational, look into term-rewriting systems which just keep transforming specified relationships into other things. Maude’s rewriting logic and engine could probably be used for relational programming. It’s fast, too.
https://maude.cs.illinois.edu/wiki/The_Maude_System
As for "capabilities", I'm not sure I fully understand how that is advantageous to the convention of passing the helper function ("capability") as an argument to the "capable" function.
For instance, in Zig, you can see that a function allocates memory (capability) because it requires you to pass an allocator that it can call!
I'd like to see if others are more creative than me!
In Zig it's conventional to pass an allocator, but any code can end run around the convention by reaching for page_allocator or c_allocator behind your back. Capabilities upgrade that convention into a guarantee.
That's pretty much how it plays out, as I understand it.
The trick is making sure that that object is the only possible way to do the thing. And making more features like that, for example networking, or file I/O, etc.
Totally agree that programming languages are a bit stagnant, with most new features being either trying to squeeze a bit more correctness out via type systems (we're well into diminishing returns here at the moment), or minor QoL improvements. Both are useful and welcome but they aren't revolutionary.
That said, here's some of the feedback of the type you said you didn't want >8)
(1) Function timeouts. I don't quite understand how what you want isn't just exceptions. Use a Java framework like Micronaut or Spring that can synthesize RPC proxies and you have things that look and work just like function calls, but which will throw exceptions if they time out. You can easily run them async by using something like "CompletableFuture.supplyAsync(() -> proxy.myCall(myArgs))" or in Kotlin/Groovy syntax with a static import "supplyAsync { proxy.myCall(myArgs) }". You can then easily wait for it by calling get() or skip past it. With virtual threads this approach scales very well.
The hard/awkward part of this is that APIs are usually defined these days in a way that doesn't actually map well to standard function calling conventions because they think in terms of POSTing JSON objects rather than being a function with arguments. But there are tools that will convert OpenAPI specs to these proxies for you as best they can. Stricter profiles that result in more idiomatic and machine-generatable proxies aren't that hard to do, it's just nobody pushed on it.
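The "call that throws if it times out" pattern from point (1) looks roughly like this in Python with the standard `concurrent.futures` module. `slow_call` is a hypothetical stand-in for a generated RPC proxy, and the sketch returns a tagged tuple instead of propagating the exception so the two outcomes are easy to compare.

```python
import concurrent.futures
import time

def slow_call(delay, value):
    """Hypothetical stand-in for a remote call that takes `delay` seconds."""
    time.sleep(delay)
    return value

def call_with_timeout(fn, timeout, *args):
    """Run fn(*args) on a worker thread; return ('ok', result) or ('timeout', None)."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return ("ok", future.result(timeout=timeout))
    except concurrent.futures.TimeoutError:
        return ("timeout", None)
    finally:
        # Do not block waiting for the worker; the call itself is not cancelled.
        pool.shutdown(wait=False)

fast = call_with_timeout(slow_call, 1.0, 0.01, "done")
slow = call_with_timeout(slow_call, 0.05, 0.5, "late")
```

Note that the timed-out call keeps running in the background. Giving up on waiting is easy; actually cancelling the work is the part that needs cooperation from the callee.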
(2) Capabilities. A language like Java has everything needed to do capabilities (strong encapsulation, can restrict reflection). A java.io.File is a capability, for instance. It didn't work out because ambient authority is needed for good usability. For instance, it's not obvious how you write config files that contain file paths in systems without ambient authority. I've seen attempts to solve this and they were very ugly. You end up needing to pass a lot of capabilities down the stack, ideally in arguments but that breaks every API ever designed so in reality in thread locals or globals, and then it's not really much different to ambient authority in a system like the SecurityManager. At least, this isn't really a programming language problem but more like a standard library and runtime problem.
(3) Production readiness. The support provided by app frameworks like Micronaut or Spring for things like logging is pretty good. I've often thought that a new language should really start by taking a production server app written in one of these frameworks and then examining all the rough edges where the language is mismatched with need. Dependency injection is an obvious one - modern web apps (in Java at least) don't really use the 'new' keyword much which is a pretty phenomenal change to the language. Needing to declare a logger is pure boilerplate. They also rely heavily on code generators in ways that would ideally be done by the language compiler itself. Arguably the core of Micronaut is a compiler and it is a different language, one that just happens to hijack Java infrastructure along the way!
What's interesting about this is that you could start by forking javac and go from there, because all the features already exist and the work needed is cleaning up the resulting syntax and semantics.
(4) Semi-dynamic. This sounds almost exactly like Java and its JIT. Java is a pretty dynamic language in a lot of ways. There's even "invokedynamic" and "constant dynamic" features in the bytecode that let function calls and constants be resolved in arbitrarily dynamic ways at first use, at which point they're JITd like regular calls. It sounds very similar to what you're after and performance is good despite the dynamism of features like lazy loading, bytecode generated on the fly, every method being virtual by default etc.
(5) There's a library called Permazen that I think gets really close to this (again for Java). It tries to match the feature set of an RDBMS but in a way that's far more language integrated, so no SQL, all the data types are native etc. But it's actually used in a mission critical production application and the feature set is really extensive, especially around smooth but rigorous schema evolution. I'd check it out, it certainly made me want to have that feature set built into the language.
(6) Sounds a bit like PL/SQL? I know you say you don't want SQL but PL/SQL and derivatives are basically regular programming languages that embed SQL as native parts of their syntax. So you can do things like define local variables where the type is "whatever the type of this table column is" and things like that. For your example of easily loading and debug dumping a join, it'd look like this:
It's not a beautiful language by any means, but if you want a natively relational language I'm not sure how to make it moreso.
(7) I think basically all server apps are written this way in Java, and a lot of client (mobile) too. It's why I think a language with integrated DI would be interesting. These frameworks provide all the features you're asking for already (overriding file systems, transactions, etc), but you don't need to declare interfaces to use them. Modern injectors like Avaje Inject, Micronaut etc let you directly inject classes. Then you can override that injection for your tests with a different class, like a subclass. If you don't want a subtyping relationship then yes you need an interface, but that seems OK if you have two implementations that are really so different they can't share any code at all. Otherwise you'd just override the methods you care about.
Automatically working out the types of parameters sounds a bit like Hindley-Milner type inference, as seen in Haskell.
(8) The common way to do this in the Java world is have an annotation processor (compiler plugin) that does the lints when triggered by an annotation, or to create an IntelliJ plugin or pre-canned structural inspection that does the needed AST matching on the fly. IntelliJ's structural searches can be saved into XML files in project repositories and there's a pretty good matching DSL that lets you say things like "any call to this method with arguments like that and which is inside a loop should be flagged as a warning", so often you don't need to write a proper plugin to find bad code patterns.
I realize you didn't want feedback of the form "but X can do this already", still, a lot of these concepts have been explored elsewhere and could be merged or refined into one super-language that includes many of them together.
PL/SQL is an abomination of a language. It’s easily the worst example you could have given.
My 4 cents:
- I like the idea of a multiparadigm programming language (many exists) but where you can write part of the code in a different language, not trying to embed everything in the same syntax. I think in this way you can write code and express your ideas differently.
- A [social] programming language where some variables and workflows are shared between users [1][2].
- A superreflective programming language inspired by Python, Ruby, and others where you can override practically everything to behave different. For example, in Python you can override a function call for an object but not for the base system, globals() dict cannot be overriden. See [3]. In this way you save a lot of time writing a parser and the language basic logic.
- A declarative language to stop reinventing the wheel: "I need a website with a secure login/logout/forgot_your_password_etc, choose a random() template". It doesn't need to be in natural language though.
[1] https://blog.databigbang.com/ideas-egont-a-web-orchestration...
[2] https://blog.databigbang.com/egont-part-ii/
[3] https://www.moserware.com/2008/06/ometa-who-what-when-where-...
Egont sounds a bit like SQL, no? A social way to share data and work with it ... a shared RDBMS where everyone has a user account and can create tables/share them with other users, built in security, etc. Splat a GUI on top and you have something similar.
Modern web frameworks are getting pretty declarative. If you want a basic web app with a log in/out page that's not hard to do. I'm more familiar with Micronaut than Spring but you'd just add:
and the relevant dependencies. Now you write a class that checks the username/password, or use LDAP, or configure OAuth and the /login URL takes a POST of username/password. Write a bit of HTML that looks good for your website and you're done.
https://micronaut-projects.github.io/micronaut-security/late...
> Egont sounds a bit like SQL, no? A social way to share data and work with it ... a shared RDBMS where everyone has a user account and can create tables/share them with other users, built in security, etc. Splat a GUI on top and you have something similar.
Yes, SQL or a global spreadsheet. I would say that it is like SQL plus a DAG or, we can imagine an aggregation of SQLs. The interesting thing is that parts of the global system are only recalculated if there is a change, like in a spreadsheet.
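To make the "only recalculated if there is a change" part concrete, here is a minimal sketch of spreadsheet-style propagation over a dependency graph. The `Sheet` class, the cell names and the `recalc_log` list are all made up for illustration and are not taken from Egont. The propagation is deliberately naive (a diamond-shaped graph could recompute a cell more than once); a real system would order the work topologically.

```python
from collections import defaultdict


class Sheet:
    """Hypothetical shared sheet: cells are plain values or formulas over other cells."""

    def __init__(self):
        self.values = {}                     # cell name -> current value
        self.formulas = {}                   # cell name -> (function, input names)
        self.dependents = defaultdict(list)  # cell name -> cells that read it
        self.recalc_log = []                 # formulas that actually re-ran

    def define(self, name, func, inputs):
        """Register a formula cell and compute its first value."""
        self.formulas[name] = (func, inputs)
        for dep in inputs:
            self.dependents[dep].append(name)
        self._recompute(name)

    def set(self, name, value):
        """Set an input cell; an unchanged value touches nothing downstream."""
        if self.values.get(name) == value:
            return
        self.values[name] = value
        for child in self.dependents[name]:
            self._recompute(child)

    def _recompute(self, name):
        func, inputs = self.formulas[name]
        new_value = func(*(self.values[i] for i in inputs))
        self.recalc_log.append(name)
        # Propagate further only if the result really changed.
        if self.values.get(name) != new_value:
            self.values[name] = new_value
            for child in self.dependents[name]:
                self._recompute(child)


sheet = Sheet()
sheet.set("alice.price", 10)
sheet.set("alice.qty", 3)
sheet.set("bob.tax_rate", 0.5)
sheet.define("alice.total", lambda p, q: p * q, ["alice.price", "alice.qty"])
sheet.define("bob.tax", lambda t, r: t * r, ["alice.total", "bob.tax_rate"])

sheet.recalc_log.clear()
sheet.set("bob.tax_rate", 0.25)  # only bob.tax reads this cell
sheet.set("alice.qty", 3)        # same value as before: nothing re-runs
```

After the last two calls only `bob.tax` has been recomputed; `alice.total` is left alone because none of its inputs changed.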
> a shared RDBMS where everyone has a user account and can create tables/share them with other users, built in security, etc. Splat a GUI on top and you have something similar.
We need a little bit more but not much more: security by namespaces and/or rows, so the same database is shared but you can restrict who changes what: your "rows" are yours. I think something like OrbitDB but with namespaces would be cool.
> Modern web frameworks are getting pretty declarative.
Yes but my proposal was at a higher level. I don't want to know what a cookie is when I just want to create a website. I am not saying that you can create complex components with this idea but you can create common use cases.
> solve the problem by making all function calls async.
This is just blocking code and it’s beautiful.
Is it just me, or is "Capabilities", whatever it is, not explained at all?
It is not. I didn't want to give a half explanation, but it is another case of how hard it has become to come up with good Google searches.
https://erights.org/elib/capability/ode/ode-capabilities.htm... is a good start.
But you use capabilities all the time... operating system users work that way. As a user, you can't "just" execute some binary somewhere and thereby get access to parts of the system your user doesn't have rights to. (Forget setuid for a second, which is intended precisely to get around this, and let's just look at the underlying primitive.)
Capabilities in programming languages take the granularity further down. You might call some image manipulation code in a way that it doesn't have the capability to manipulate the file system in general, for example, or call a function to change a user's login name with capabilities that only allow changing that user, even if another user ID somehow gets in there.
It would be a fairly comprehensive answer to the software dependency issues that continue to bubble up; it would matter less if a bad actor took over "leftpad" if leftpad was actively constrained by the language to only be able to manipulate strings, so the worst an actor could do is make it manipulate strings the wrong way, rather than start running arbitrary code. Or put another way, if the result of the bad actor taking the package wasn't that people got hacked but users started getting
which would immediately raise eyebrows.
It's not a new idea, in that E already tried it, and bits and pieces of it are everywhere ("microkernels" is another place where you'll see this idea, but at the OS level and implemented in languages that have no native concept of the capabilities), but for the most part our programming languages do not reflect this.
> But you use capabilities all the time... operating system users work that way.
Most operating systems don't have proper capabilities - they use things like ACLs, RBAC, MAC, etc. for permissions.
The golden rule of capabilities is that you should not separate designation from authority. The capability itself represents the authority to access something, and designates what is being accessed.
I think the Austral language tries to do some capability based things: https://austral-lang.org/.
Thanks for this link!
It has this other link that explains more on it : https://en.m.wikipedia.org/wiki/Capability-based_security
I think I get it now. I honestly never had heard about this before and trying to Google search from the original post I was coming up empty.
For the equivalent in operating systems land, look at the respective manual pages for Linux capabilities[1] or OpenBSD pledge[2] and unveil[3]. The general idea is that there are some operations that might be dangerous, and maybe we don't want our program to have unrestricted access to them. Instead, we opt-in to the subset that we know we need, and don't have access to the rest.
There's some interest in the same thing, but at the programming language level. I'm only aware of it being implemented academically.
[1]: https://man7.org/linux/man-pages/man7/capabilities.7.html
[2]: https://man.openbsd.org/pledge.2
[3]: https://man.openbsd.org/unveil.2
I don't think that Linux capabilities have much to do with the capabilities that the OP intends.
In a capabilities system, a program has permission to act on any object if it has a reference (aka a capability) to the object; there is no other access control. A program acquires a capability either by receiving it from its parent (or caller, in the case of a function) or in some other way, such as message passing. There is no other source of capabilities, and they are unforgeable.
Unix file descriptors act in many ways as capabilities: they are inherited by processes from their parents and can be passed around via Unix sockets, and grant to the FD holder the same permissions to the referenced object as the creator of the file descriptor.
Of course, since Unix has ways of creating file descriptors other than inheritance and message passing, it is not truly a capabilities system.
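As a rough illustration of the file-descriptor point, the sketch below passes an open descriptor over a Unix socket pair inside a single process. It assumes a Unix-like OS and Python 3.9 or later (for `socket.send_fds` and `socket.recv_fds`); the two ends of the socket pair merely stand in for a parent and child process, and the file contents are invented.

```python
import os
import socket
import tempfile

# "Parent" side: create a file and open it read-only.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"secret report\n")
path = tmp.name
read_only_fd = os.open(path, os.O_RDONLY)

# A connected pair of Unix sockets stands in for two cooperating processes.
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
socket.send_fds(sender, [b"fd"], [read_only_fd])

# "Child" side: it never learns the path, it only receives the descriptor.
_msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
received_fd = fds[0]
data = os.read(received_fd, 100)

# The descriptor carries exactly the access it was opened with: reads only.
try:
    os.write(received_fd, b"tamper")
    write_refused = False
except OSError:
    write_refused = True

# Clean up the descriptors, sockets and temporary file.
for fd in (read_only_fd, received_fd):
    os.close(fd)
sender.close()
receiver.close()
os.unlink(path)
```

The receiver can read through the descriptor but cannot write, because the descriptor was opened read-only. Nothing here stops the receiver from calling `os.open` on some other path itself, which is the "not truly a capabilities system" caveat.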
It's implemented in Java! .NET tried it too, UNIX file descriptors are capabilities, Mach ports are capabilities. Capabilities are widely used far outside of academia and have been for a long time.
What people often mean when they say this is a so-called pure capability system, where there are no ambient permissions at all. Such systems have terrible usability and indeed have never been made to work anywhere, not even in academia as far as I know.
> This is not a new idea, so I won’t go deeply into what it is
So, no, the author claims it too.
Capabilities are a way to do access control where the client holds the key to access something, instead of the server holding a list of what is allowed based on the clients' identities.
But when people use that word, they are usually talking about fine-grained access control. At the language level, that would mean, for example, not granting a library the ability to make network connections, even though your program as a whole has that kind of access.
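A minimal sketch of the two styles side by side, with invented names throughout (`AclServer`, `CapabilityServer`, the file table and the users are all hypothetical). In the first, the server decides from the caller's identity; in the second, the client presents an unguessable token that both names the file and authorises reading it.

```python
import secrets

FILES = {"report.txt": "q3 numbers", "payroll.txt": "salaries"}


class AclServer:
    """Identity-based: the server keeps a list of who may read what."""

    def __init__(self):
        self._acl = {"alice": {"report.txt"}}

    def read(self, user, filename):
        if filename not in self._acl.get(user, set()):
            raise PermissionError(f"{user} may not read {filename}")
        return FILES[filename]


class CapabilityServer:
    """Key-based: whoever holds a granted token can read the file it names."""

    def __init__(self):
        self._grants = {}  # token -> filename

    def grant(self, filename):
        token = secrets.token_hex(16)  # unguessable, so effectively unforgeable
        self._grants[token] = filename
        return token

    def read(self, token):
        if token not in self._grants:
            raise PermissionError("unknown capability")
        return FILES[self._grants[token]]


acl = AclServer()
via_acl = acl.read("alice", "report.txt")

caps = CapabilityServer()
token = caps.grant("report.txt")
via_capability = caps.read(token)  # no user name involved at all
```

Note that the capability version never asks who is calling, and the token for `report.txt` cannot be used to name `payroll.txt`.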
Kind of. At a more fundamental level, it's applying the idea that "invalid states should be unrepresentable" (e.g. https://hugotunius.se/2020/05/16/making-invalid-state-unrepr... ) but to our code itself.
For example, consider a simple function to copy files. We could implement it like this:
However, there are many ways that things could go awry when writing code like this; e.g. it will write to the wrong file, since I forgot to put the real `out` value back after testing (oops!). Such problems are only possible because we've given this function the capability to call `fs.open` (in many languages the situation's even worse, since that capability is "ambient": available everywhere, without having to be passed in like `fs` above). There are also other capabilities/permissions/authorities implicit in this code, since any call to `fs.open` has to have the right permissions to read/write those files.
In contrast, consider this alternative implementation:
This version can't use the wrong files, since it doesn't have any access to the filesystem: there's literally nothing we could write here that would mean "open a file"; it's unrepresentable. This code also can't mix up the input/output, since only `inH` has a `.read()` method and only `outH` has a `.write()` method. The `fs.open` calls will still need to be made somewhere, but there's no reason to give our `copy` function that capability.
In fact, we can see the same thing on the CLI:
- The first version is like `cp oldPath newPath`. Here, the `cp` command needs access to the filesystem, it needs permission to open files, and we have to trust that it won't open the wrong files.
- The second version is like `cat < oldPath > newPath`. The `cat` command doesn't need any filesystem access or permissions, it just dumps data from stdin to stdout; and there's no way it can get them mixed up.
The fundamental idea is that trying to choose whether an action should be allowed or not (e.g. based on permissions) is too late. It's better if those who shouldn't be allowed to do an action, aren't even able to express it at all.
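For readers without the original snippets in front of them, the two styles of `copy` described above might look roughly like this in Python. This is an illustrative sketch, not the code from the comment: `copy_by_path`, `Reader` and `Writer` are made-up names, and Python only hides `_stream` by convention, so the restriction here is not enforced the way a capability-safe language would enforce it.

```python
import io


def copy_by_path(fs_open, in_path, out_path):
    """First style: the function is handed the power to open any path."""
    with fs_open(in_path, "r") as in_h, fs_open(out_path, "w") as out_h:
        out_h.write(in_h.read())


class Reader:
    """Handle that exposes only read()."""

    def __init__(self, stream):
        self._stream = stream

    def read(self):
        return self._stream.read()


class Writer:
    """Handle that exposes only write()."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, data):
        self._stream.write(data)


def copy(in_h, out_h):
    """Second style: no paths and no open(); only the two handles it was given."""
    out_h.write(in_h.read())


# The caller decides which streams are involved; copy() cannot choose others.
source = io.StringIO("hello")
sink = io.StringIO()
copy(Reader(source), Writer(sink))
```

Swapping the arguments to the second `copy` fails immediately, because a `Writer` has no `read()` and a `Reader` has no `write()`.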
You're right that this can often involve "keys", but that's quite artificial: it's like adding extra arguments to each function, and limiting which code is scoped to see the values that need to be passed as those arguments (e.g. `fs.openRead(inPath, keyThatAllowsAccess)`), when we could have instead scoped our code to limit access to the functions themselves (though for HTTP APIs, everything is a URL; so "unguessable function endpoint URL" is essentially the same as "URL with secret key in it")
It's a fancy word for "access things through a handle".
The key property is that everything can only be accessed via handles, including, recursively, other handles (i.e. to get a handle to an object you first need to already have a handle to the handle-giver for that object).
A capability is basically a reference which both designates some resource to be accessed and provides the authority to access it. The authority is not held somewhere else like an Access Control List - the reference is the authority. Capabilities must be unforgeable - they're obtained by delegation.
---
To give an example of where this has been used in a programming language, Kernel[1] uses a capability model for mutating environments. Every function (or operative) has an environment which holds all of its local variables; the environment is encapsulated and internally holds a reference to the parent environment (the surrounding scope). The rule is that we can only mutate the local variables of an environment to which we have a direct reference, but we cannot mutate variables in the parents. In order to mutate the variables in the parent, we must have a direct reference to the parent, but there is no mechanism in the language to extract the parent reference from the environment it is encapsulated in.
For example, consider the following trivial bit of code: We define some variable `x` with initial value "foo", we then mutate it to have the value "bar", then look up `x`.
As expected, this returns "bar". We have a direct reference to the local environment via `(get-current-environment)`.
Technically we could've just written `($define! x "bar")`, where the current environment is assumed, but I used `$set!` because we need it for the next example.
When we introduce a new scope, the story is different.
Here we create a function foo, which has its own local environment, with the top-level environment as its parent. We can read `x` from inside this environment, but we can't mutate it. In fact, this code inserts a new variable `x` into the child environment which shadows the existing one within the scope of the function, but after `foo` has returned, this environment is lost, so the result of the computation is "foo". There is no way for the body of this lambda to mutate the top-level environment here because it doesn't have a direct reference to it.
So far, these are basically the same static scoping rules you are used to, but environments in Kernel are first-class, so we can get a reference to the top-level environment which grants the child environment the authority to mutate the top-level environment.
And the result of this computation is "bar".
However, by binding `env` in the top-level environment, all child scopes now have the ability to mutate the top-level.
To avoid polluting the environment in such way, the better way to write this is with an operative (as opposed to $lambda), which implicitly receives the caller's environment as an argument, which it binds to a variable in its local environment.
Now `foo` specifically can mutate its caller's local environment, but it can't mutate the variables of the caller of the caller, and we have not exposed this authority to all children of the top-level.
---
This is only a trivial example, but we can do much more clever things with environments in Kernel. We can construct new environments at runtime, and they can have multiple parents, ultimately forming a DAG, where environment lookup is a Depth-First-Search, but the references to the parent environments are encapsulated and cannot be accessed, so we cannot mutate parent scopes without a direct reference - we can only mutate the root node of the DAG for an environment to which we have a direct reference. The direct reference is a capability - it's both the means and the authority to mutate.
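For readers who don't know Kernel, the environment rules described above can be modelled roughly in Python. This is only an analogy under loud assumptions: the `Environment` class and the two body functions are invented here, and Python's name mangling is a convention rather than real encapsulation, whereas Kernel actually enforces that the parent references cannot be extracted.

```python
class Environment:
    """Toy model: lookup searches the parents, but define only touches this node."""

    def __init__(self, *parents):
        self.__parents = parents  # name-mangled: no public way to get these back
        self.__bindings = {}

    def define(self, name, value):
        # Mutation is always local to the environment you hold a reference to.
        self.__bindings[name] = value

    def lookup(self, name):
        if name in self.__bindings:
            return self.__bindings[name]
        for parent in self.__parents:  # depth-first search through the parent DAG
            try:
                return parent.lookup(name)
            except KeyError:
                continue
        raise KeyError(name)


top = Environment()
top.define("x", "foo")


def shadowing_body(local_env):
    """Holds only its own environment: it can read x but merely shadows it."""
    local_env.define("x", "bar")
    return local_env.lookup("x")


def mutating_body(local_env, caller_env):
    """Was explicitly handed the caller's environment, so it can mutate it."""
    caller_env.define("x", "bar")
    return local_env.lookup("x")


seen_inside = shadowing_body(Environment(top))
after_shadowing = top.lookup("x")
mutating_body(Environment(top), top)
after_mutation = top.lookup("x")
```

In this model, `after_shadowing` is still "foo" while `after_mutation` is "bar": the only difference between the two calls is whether a direct reference to `top` was passed in.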
---
We can use these first-class environments in conjunction with things like `$remote-eval`, which evaluates some piece of code in an environment provided by the user, which may contain only the bindings they specify, and does not capture anything from the surrounding scope.
However, if we try to use some feature like IO to write output to the console, we get an error: `write` is unbound, even though `write` is available in the scope in which we performed this evaluation. We could catch this error with a guarded continuation so the program does not crash.
This combination of features basically lets you create "mini sandboxes", or custom DSLs, with more limited capabilities than the context in which they're evaluated. Most languages only let you add new capabilities to the static environment, by defining new functions and types - but how many languages let you subtract capabilities, so that fewer features are available in a given context? Most languages do this purely at compile time via a module/import system, or with static access modifiers like `public` and `private`. Kernel lets you do this at runtime.
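The closest everyday analogue I know of is Python's `eval` with an explicit globals dict. The sketch below (where `eval_in` is a made-up helper, not a Kernel or Python API) shows the "unbound even though it exists outside" behaviour. Big caveat: unlike Kernel's environments, this is not a real sandbox in Python, since object introspection can get around it, so treat it purely as an illustration of subtracting names from a context.

```python
def eval_in(expr, bindings):
    """Evaluate expr with only the given names visible (illustration, not a sandbox)."""
    env = {"__builtins__": {}}  # stop eval from adding the usual builtins
    env.update(bindings)
    return eval(expr, env)


# Only x and y exist inside the evaluated expression.
total = eval_in("x + y", {"x": 1, "y": 2})

# print is available out here, but it is unbound inside the restricted environment.
try:
    eval_in("print('hello')", {"x": 1})
    print_was_available = True
except NameError:
    print_was_available = False
```

The `try`/`except` plays the role of the guarded continuation mentioned above: the lookup failure is caught instead of crashing the program.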
---
One thing missing from this example, which is required for true capabilities, is the ability to revoke the authority. The only way we could revoke the capability of a function to mutate an environment is to suspend the program.
Proper capabilities allow revocation at any time. If the creator of a capability revokes the authority, this should propagate to all duplicated, delegated, or derived capabilities with immediate effect. The capabilities that were held become "zombies", which provide neither the means nor the authority - and this is why it is essential that we don't separate designation from authority, and why these should both be encapsulated in the capability.
This clearly makes it difficult to provide proper capabilities in programming languages, because we have to handle every possible error where we attempt to access a zombie capability. The use of such capabilities should be limited to where they really matter such as access to operating system resources, cryptographic keys, etc. where it's reasonable to implement robust error handling code. We don't want capabilities for every programming language feature because we would need to insert error checks on every expression to handle the potential zombie. Attempting to check if a capability is live before using it is no solution anyway, because you would have race conditions, so the correct approach to using them is to just try and catch the error if it occurs.
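A small sketch of what revocation with propagation could look like, using a forwarding wrapper. All names here (`Capability`, `RevokedError`, `call`, `delegate`) are hypothetical, and for brevity the holder of a capability can also revoke it; a more careful design would hand the revoke authority only to the creator.

```python
class RevokedError(Exception):
    """Raised when a zombie capability is used."""


class Capability:
    """Forwards calls to a target until revoked; delegated copies forward through it."""

    def __init__(self, target):
        self._target = target  # the resource itself, or the parent Capability

    def call(self, method, *args):
        if self._target is None:
            raise RevokedError("capability has been revoked")
        if isinstance(self._target, Capability):
            # Derived capability: the check is repeated at every link in the chain.
            return self._target.call(method, *args)
        return getattr(self._target, method)(*args)

    def delegate(self):
        """Hand out a derived capability that dies when this one does."""
        return Capability(self)

    def revoke(self):
        self._target = None


log = []                      # stands in for some protected resource
root = Capability(log)
delegated = root.delegate()

delegated.call("append", "first")  # works while the chain is live
root.revoke()                      # the creator pulls the authority

# No "is it still live?" check beforehand: just try, and handle the zombie case.
try:
    delegated.call("append", "second")
    second_write_refused = False
except RevokedError:
    second_write_refused = True
```

Revoking `root` immediately turns `delegated` into a zombie, and the caller handles that by catching the error rather than testing liveness first, which would only introduce a race.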
Another take-away from this is that if capabilities are provided in a language via a type system, it must be a dynamic type system. You cannot grant authority in a static type system at compile time if the capability may have already been revoked by the time the program is run. Capabilities are inherently dynamic by nature because they can be revoked at any time. This doesn't mean you can't use capabilities in conjunction with a static type system - only that the static type system can't really represent capabilities.
You can find out a lot more about them on the erights page that others have linked, and I would recommend looking into seL4 if you're interested in how they're applied to operating systems.
---
[1]:http://web.cs.wpi.edu/%7Ejshutt/kernel.html
Several unrelated comments:
* In general, whenever I hear "compiler will optimize this", I die a little on the inside. Not even because it's delegating solution of the newly created problem to someone else, but because it creates a disconnect between what the language tells you is possible and what actually is possible. It encourages this kind of multi-layer lie that, in anger, you will have to untangle later, and will be cursing a lot, and will not like the language one bit.
* Capabilities. Back in the days when ActionScript 3 was relevant, there was a big problem of dynamic code sharing. Many services tried to implement module systems in AS3, but the security was not done well. To give you an example: a gaming portal written in AS3 wants to load games written by programmers who aren't the portal programmers (and could be malicious, i.e. trying to steal data from other programs, or cause them to malfunction etc.) ActionScript (and by extension the proposed ECMAScript 4) had a concept of namespaces borrowed from XML (so not like in C++), where the availability of a particular function was, among other things, governed by whether the caller was allowed to access the namespace. There were some built-in namespaces, like "public", "private", "protected" and "internal", that functioned similarly to Java's namesakes. But users were allowed to add any number of custom namespaces. These namespaces could then be shared through a function call in a public namespace. I.e. the caller would have to call the function and supply some kind of a password, and if the password matched, the function would return the namespace object, and then the caller could use that namespace object to call the functions in that namespace. I tried to promote this concept in Flex Framework for dealing with module loading, but that never was seriously considered... Also, people universally hated XML namespaces (similar to how people seem to universally hate regular expressions). But, I still think that it could've worked...
* All this talk about "dynamic languages"... I really don't like it when someone creates a bogus category and then says something very general about it. That whole section has no real value.
* A Truly Relational Language -- You mean, like Prolog? I wish more relational databases exposed their content via a Prolog-like language in addition to SQL. I believe it's doable, but very few people seem to want it, and so it's not done.
"Semi-Dynamic Language" - Zig?
"Value Database" - Mumps? lol
“It feels like programming languages are stagnating.”
As they should be. Not every language needs to turn into C++, Rust, Java, C#, or Kotlin.
The only group I see lamenting about features these days are PL theorists, which is fine for research languages that HN loves but very few use outside the bubble.
Some of us like the constraints of C or Go.
Folks, this is not a process that converges. We've now had 60 years of language design, use and experience. We're not going to get to an ideal language because there are (often initially hidden) tradeoffs to be made. Everyone has a different idea of which side of each tradeoff should be taken. Perhaps in the future we can get AI to generate and subsequently re-generate code, thereby avoiding the need to worry too much about language design (AI doesn't care that it constantly copies/pastes or has to refactor all the time).
Practically it has kinda converged, though not necessarily on a global optimum. Most coding is done in relatively few languages.
Marklar
https://www.youtube.com/watch?v=BSymxjrzdXc
I found it amusing most of the language is supposedly contextually polymorphic by definition. =3
For "value database", it seems to me that the trick is, you can't just ship the executable. You have to ship the executable plus the stored values, together, as your installation image. Then what you run in production is what you tested in staging, which is what you debugged on your development system.
I mean, there still may be other things that make this a Bad Idea(TM). But I think this could get around the specific issue mentioned in the article.
Basically, it's the Windows registry.
If it's about well-contained applications in a well designed (and user-centric) OS with a proper concept of "application" and "installation", with a usable enough mechanism, I don't see anything that would make it bad.
On Windows it's a disaster. To the point that dumping random text files around in Linux works better.