Not building with newest LDC D compiler #378

Open
cyrusmsk opened this issue Mar 13, 2023 · 21 comments

Comments

@cyrusmsk
Contributor

I just found this issue in my fork and want to warn you, @hanabi1224. Before merging the next PR, maybe we could pin the compiler version in build-d.yaml.
An issue has already been opened in the library's GitHub repository. I will update this thread when new information arrives.
Sorry for the inconvenience.

@cyrusmsk
Contributor Author

It seems the issue is fixed, and the fix will be available in LDC 1.32.1.

@renatoathaydes

renatoathaydes commented Dec 15, 2023

I couldn't compile the binary tree 1.d solution until I fixed the imports... not sure if that was written for very old D versions?

Here's what works for me with D v2:

@safe:
import std.stdio;
import std.conv;
import std.algorithm.comparison;
import std.format;

extern(C) __gshared string[] rt_options = [ "gcopt=minPoolSize:300" ];

const MIN_DEPTH = 4;

class Node {
    private Node left;
    private Node right;

    this(Node left, Node right)
    {
        this.left = left;
        this.right = right;
    }

    int check() {
        auto r = 1;
        auto ln = this.left;
        auto rn = this.right;
        if (ln) {
            r += ln.check();
        }
        if (rn) {
            r += rn.check();
        }
        return r;
    }

    static Node create(int depth) {
        if (depth > 0) {
            auto d = depth - 1;
            return new Node(Node.create(d), Node.create(d));
        }
        return new Node(null, null);
    }
}

void main(string[] args)
{
    auto n = args.length > 1 ? args[1].to!int() : 6;
    auto maxDepth = max(MIN_DEPTH + 2, n);
    auto stretchDepth = maxDepth + 1;
    auto stretchTree = Node.create(stretchDepth);
    writeln(format("stretch tree of depth %d\t check: %d", stretchDepth, stretchTree.check()));
    auto longLivedTree = Node.create(maxDepth);

    for (int depth = MIN_DEPTH; depth <= maxDepth; depth += 2) {
        auto iterations = 1 << (maxDepth - depth + MIN_DEPTH);
        auto sum = 0;
        for (auto i = 0; i < iterations; i++) {
            sum += Node.create(depth).check();
        }
        writeln(format("%d\t trees of depth %d\t check: %d", iterations, depth, sum));
    }

    writeln(format("long lived tree of depth %d\t check: %d", maxDepth, longLivedTree.check()));
}

This is much slower than Java and Dart, for example, which seems very strange to me. Does anyone have any idea why that would be? I suppose this problem mostly tests the GC and allocation speed, but D still shouldn't be 5x slower, as it currently is. I couldn't speed it up myself, as I don't know D very well.
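
One way to check the guess that this benchmark is dominated by the GC is to ask the runtime how much time it spent collecting. The following is only a minimal sketch, not part of the submitted solution: it assumes `GC.profileStats()` from `core.memory` is available in your compiler's druntime, and the free functions `create` and `check` are simplified stand-ins for the `Node` methods above.

```d
import core.memory : GC;
import std.stdio : writefln;

// Simplified stand-in for the Node class used in the solution above.
class Node {
    Node left, right;
    this(Node l, Node r) { left = l; right = r; }
}

// Builds a complete binary tree; every node is a separate GC allocation.
Node create(int depth) {
    return depth > 0
        ? new Node(create(depth - 1), create(depth - 1))
        : new Node(null, null);
}

// Counts the nodes of the tree.
int check(Node n) {
    return n.left is null ? 1 : 1 + check(n.left) + check(n.right);
}

void main() {
    long sum = 0;
    foreach (i; 0 .. 1000)
        sum += check(create(10)); // short-lived trees, like the benchmark loop

    // Ask the runtime what the collector has done so far.
    auto p = GC.profileStats();
    writefln("checksum: %d", sum);
    writefln("collections: %d, total GC time: %s",
             p.numCollections, p.totalCollectionTime);
}
```

Comparing the reported GC time with the wall-clock time from `time` gives a rough idea of how much of the run is collection work. The runtime switch `--DRT-gcopt=profile:1` prints a similar summary at exit without changing the code.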

@cyrusmsk
Contributor Author

Hi @renatoathaydes
The current version works for me on LDC 1.35 (macOS).

Which compiler version are you using?
Because
import std;
should just work in many scenarios. It is not the optimal way, but it is a fine way to use it.

Regarding the speed: you can see that the top of the list for this problem is occupied by VM-based languages, such as Dart and the JVM languages.
But of course the D solution could be improved.
Also keep in mind that arena allocations are forbidden by the rules of the benchmark; otherwise languages like C++/Zig/Rust/D would have much better performance than Java.
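
To illustrate what "arena allocation" means here, the sketch below preallocates all nodes in one array and hands them out by bumping an index, so building a tree never calls the GC allocator once per node. This is the kind of strategy the benchmark rules forbid, so it is shown only for comparison. `Arena`, `alloc`, `reset` and `build` are hypothetical names, not code from the repository.

```d
import std.stdio : writeln;

struct Node {
    Node* left;
    Node* right;
}

// Hypothetical bump allocator: one up-front allocation, then index increments.
struct Arena {
    Node[] pool;
    size_t used;

    this(size_t capacity) { pool = new Node[](capacity); }

    Node* alloc(Node* l, Node* r) {
        assert(used < pool.length, "arena exhausted");
        auto p = &pool[used++];
        *p = Node(l, r);
        return p;
    }

    // Frees every node at once by rewinding the index.
    void reset() { used = 0; }
}

Node* build(ref Arena a, int depth) {
    if (depth <= 0)
        return a.alloc(null, null);
    auto l = build(a, depth - 1);
    auto r = build(a, depth - 1);
    return a.alloc(l, r);
}

int check(const(Node)* n) {
    if (n.left is null)
        return 1;
    return 1 + check(n.left) + check(n.right);
}

void main() {
    // A tree of depth 10 has 2^11 - 1 = 2047 nodes.
    auto arena = Arena((1 << 11) - 1);
    foreach (i; 0 .. 4) {
        writeln(check(build(arena, 10))); // prints 2047
        arena.reset();                    // reuse the same memory for the next tree
    }
}
```

Because the memory is reused across iterations, the collector has almost nothing to do, which is why such solutions are excluded from a benchmark meant to exercise per-node allocation.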

@cyrusmsk
Contributor Author

cyrusmsk commented Dec 16, 2023

OK, I've made some changes to bring it closer to the other solutions: I moved from a class to a struct, and it runs a bit faster on my machine.
Code:

@safe:
import std;

extern(C) __gshared string[] rt_options = [ "gcopt=minPoolSize:300 initReserve:300" ];

const MIN_DEPTH = 4;
static empty = Node(null, null);

struct Node {
    Node* left;
    Node* right;
}


int check(Node* n) {
    if (n.left is null)
        return 1;
    return 1 + check(n.left) + check(n.right);
}

Node* bottomUpTree(int d) {
    if (d <= 0) {
        return &empty;
    }
    return new Node(bottomUpTree(d - 1), bottomUpTree(d - 1));
}

void main(string[] args)
{
    auto n = args.length > 1 ? args[1].to!int() : 6;
    auto maxDepth = max(MIN_DEPTH + 2, n);
    auto stretchDepth = maxDepth + 1;
    auto checkTree = check(bottomUpTree(stretchDepth));
    writeln(format("stretch tree of depth %d\t check: %d", stretchDepth, checkTree));
    auto longLivedTree = bottomUpTree(maxDepth);

    for (int depth = MIN_DEPTH; depth <= maxDepth; depth += 2) {
        auto iterations = 1 << (maxDepth - depth + MIN_DEPTH);
        auto sum = 0;
        for (auto i = 0; i < iterations; i++) {
            sum += check(bottomUpTree(depth));
        }
        writeln(format("%d\t trees of depth %d\t check: %d", iterations, depth, sum));
    }

    writeln(format("long lived tree of depth %d\t check: %d", maxDepth, check(longLivedTree)));
}

For comparison of performance:
Go version:

long lived tree of depth 18 check: 524287
1,50 real 4,89 user 0,34 sys

Current D version:

long lived tree of depth 18 check: 524287
1,09 real 1,05 user 0,03 sys

Updated D version:

long lived tree of depth 18 check: 524287
0,63 real 0,59 user 0,04 sys

@renatoathaydes

Hi @cyrusmsk

I was using gdc and that doesn't work on gdc. I am testing this on Linux.

I can compile your code using DMD and it does run, and is a bit faster.

DMD time:

➜  d dmd -of=main -O main.d
➜  d time ./main 18
stretch tree of depth 19	 check: 1048575
262144	 trees of depth 4	 check: 8126464
65536	 trees of depth 6	 check: 8323072
16384	 trees of depth 8	 check: 8372224
4096	 trees of depth 10	 check: 8384512
1024	 trees of depth 12	 check: 8387584
256	 trees of depth 14	 check: 8388352
64	 trees of depth 16	 check: 8388544
16	 trees of depth 18	 check: 8388592
long lived tree of depth 18	 check: 524287
./main 18   1.62s  user 0.12s system 99% cpu 1.740 total
avg shared (code):         0 KB
avg unshared (data/stack): 0 KB
total (sum):               0 KB
max memory:                313 MB
page faults from disk:     0
other page faults:         79522

LDC2:

➜  d source ~/dlang/ldc-1.35.0/deactivate         
source: no such file or directory: /home/renato/dlang/ldc-1.35.0/deactivate
➜  d source ~/dlang/ldc-1.35.0/activate  
(ldc-1.35.0)➜  d ldc2 -of=main -O main.d
(ldc-1.35.0)➜  d time ./main 18
stretch tree of depth 19	 check: 1048575
262144	 trees of depth 4	 check: 8126464
65536	 trees of depth 6	 check: 8323072
16384	 trees of depth 8	 check: 8372224
4096	 trees of depth 10	 check: 8384512
1024	 trees of depth 12	 check: 8387584
256	 trees of depth 14	 check: 8388352
64	 trees of depth 16	 check: 8388544
16	 trees of depth 18	 check: 8388592
long lived tree of depth 18	 check: 524287
./main 18   1.25s  user 0.16s system 99% cpu 1.416 total
avg shared (code):         0 KB
avg unshared (data/stack): 0 KB
total (sum):               0 KB
max memory:                312 MB
page faults from disk:     0
other page faults:         79508

I was surprised that DMD gets nearly the same speed as LDC (1.74 s vs. 1.42 s total).

For reference, here are the timings for the Dart solution (which I submitted in a PR). It is faster than the #1 Java solution, so it should be faster than the Kotlin one as well:

(ldc-1.35.0)➜  d dart compile exe tree.dart
Generated: /home/renato/programming/experiments/d/tree.exe
(ldc-1.35.0)➜  d time ./tree.exe 18
stretch tree of depth 19	 check: 1048575
262144	 trees of depth 4	 check: 8126464
65536	 trees of depth 6	 check: 8323072
16384	 trees of depth 8	 check: 8372224
4096	 trees of depth 10	 check: 8384512
1024	 trees of depth 12	 check: 8387584
256	 trees of depth 14	 check: 8388352
64	 trees of depth 16	 check: 8388544
16	 trees of depth 18	 check: 8388592
long lived tree of depth 18	 check: 524287
./tree.exe 18   0.65s  user 0.05s system 103% cpu 0.678 total
avg shared (code):         0 KB
avg unshared (data/stack): 0 KB
total (sum):               0 KB
max memory:                63 MB
page faults from disk:     0
other page faults:         15755

So D is still much slower, disappointingly.

@cyrusmsk
Contributor Author

Usually Dart is much slower than D. It is just that the algorithm this problem requires does not do well here.

@cyrusmsk
Contributor Author

For example, there is another approach to binary trees: https://github.com/BinaryTrees
That implementation should be much more efficient in compiled C++/D/Rust than in Dart/Java/Kotlin.

@cyrusmsk
Contributor Author

@renatoathaydes

> Usually Dart will be much slower than D.

I also ran an HTTP server and Dart again was quite a bit faster. In which problems is Dart slower?

@cyrusmsk
Contributor Author

cyrusmsk commented Dec 16, 2023

> Usually Dart will be much slower than D.
>
> I also ran an HTTP server and Dart again was quite a bit faster. In which problems is Dart slower?

Many of them, actually: https://programming-language-benchmarks.vercel.app/dart-vs-d
Yes, the D server implementation in this repo is just broken.
For HTTP servers, you can check something like this: https://web-frameworks-benchmark.netlify.app/result?l=dart,d
D does not have the fastest servers, but it does well enough.

@renatoathaydes

The HTTP Server I tried was not from this repository but from the "canonical" servers in each language (Dart has one in the stdlib and for D I used Vibe.d).

In the comparisons page you linked, D is faster in some, slower in some, much faster in some, much slower in some, but mostly it's pretty close - with D perhaps having at most a small edge ... so I think your assertion that Dart is much slower than D doesn't reflect that page at all.

@cyrusmsk
Contributor Author

cyrusmsk commented Dec 17, 2023

> The HTTP Server I tried was not from this repository but from the "canonical" servers in each language (Dart has one in the stdlib and for D I used Vibe.d).
>
> In the comparisons page you linked, D is faster in some, slower in some, much faster in some, much slower in some, but mostly it's pretty close - with D perhaps having at most a small edge ... so I think your assertion that Dart is much slower than D doesn't reflect that page at all.

Let's look more closely, for a more accurate analysis.
I'm not sure what work Google has done on AOT compilation in Dart. Is it based on LLVM optimizations?

Where Dart is faster:

  1. HTTP server: doesn't count; check the dedicated web benchmark instead. ✅
  2. Binarytrees, merkletrees: because of the benchmark rules. Compiled languages are forbidden from using arena/memory-pool strategies, while VM-based languages most probably use them under the hood. With a proper implementation, C++/Rust/D would be faster. See the original benchmarks game repo. ✅
  3. Coro-prime-sieve: this was not my implementation (it was written by the author of the repo, who is not very experienced with D). It is mostly based on async. I need to check whether the D implementation could be better. It will definitely be more complex, because D doesn't have async/await, but I'm not sure it has to be slower if properly implemented. 🟡
  4. Pidigits: it uses BigInt, which is quite slow for big values in the standard library. But it is possible to use GMP (and some languages just use GMP internally). ✅
    Please try to compare with this code (make sure you add gmp-d as a dependency):
@safe:
import std;
import std.outbuffer : OutBuffer;
import gmp.z;

alias Z = CopyableMpZ;


void main(string[] args)
{
    auto digits_to_print = args.length > 1 ? args[1].to!int() : 27;

    immutable one = 1.Z;
    immutable two = 2.Z;
    immutable ten = 10.Z;

    int digits_printed = 0;
    auto k = 1.Z;
    auto n1 = 4.Z;
    auto n2 = 3.Z;
    auto d = 1.Z;
    Z u,v,w;

    OutBuffer b = new OutBuffer();
    while(true) {

        u = n1 / d;
        v = n2 / d;
        if (u == v) {
            b.writef("%s",u);
            digits_printed += 1;
            int digits_printed_mod = digits_printed % 10;
            if (digits_printed_mod == 0)
                b.writef("\t%d\n", digits_printed);

            if (digits_printed >= digits_to_print) {
                if (digits_printed_mod > 0) {
                    foreach(_; 0..(10 - digits_printed_mod))
                        b.write(" ");
                    b.writef("\t%d\n", digits_printed);
                }

                write(b.toString());
                return;
            }

            auto to_minus = u * ten * d;
            n1 = n1 * ten - to_minus;
            n2 = n2 * ten - to_minus;
        }
        else {
            auto k2 = k * two;
            u = n1 * (k2 - one);
            v = n2 * two;
            w = n1 * (k - one);
            n1 = u + v;
            u = n2 * (k + two);
            n2 = w + u;
            d = d * (k2 + one);
            k = k + one;
        }
    }
}

@cyrusmsk
Contributor Author

> The HTTP Server I tried was not from this repository but from the "canonical" servers in each language (Dart has one in the stdlib and for D I used Vibe.d).
>
> In the comparisons page you linked, D is faster in some, slower in some, much faster in some, much slower in some, but mostly it's pretty close - with D perhaps having at most a small edge ... so I think your assertion that Dart is much slower than D doesn't reflect that page at all.

OK. With the help of 'brianush1', code for coro-prime-sieve has been prepared. It's slower than Go, but I think it should be close in performance to Rust/Crystal, and I assume it is faster than Dart.

import core.thread;
import std;
import std.outbuffer: OutBuffer;

class Generator(T) : Fiber {
	private T value;

	this(void delegate() dg) {
		super(dg);
	}

	static void yield(T value) {
		(cast(Generator!T) Fiber.getThis()).value = value;
		Fiber.yield();
	}

	T getNext() {
		call();
		return value;
	}

	int opApply(scope int delegate(T) dg) {
		int result = 0;

		while (state != State.TERM) {
			result = dg(getNext());
			if (result)
				break;
		}
		return result;
	}
}

auto generate() {
	return new Generator!long({
		long i = 2;
		while (true) {
			Generator!long.yield(i);
			i += 1;
		}
	});
}

auto filter(Generator!long ch, long prime) {
	return new Generator!long({
		foreach (i; ch)
			if (i % prime != 0)
				Generator!long.yield(i);
	});
}

void main(string[] args) {
	long n = args.length > 1 ? args[1].to!long : 10;

	auto buf = new OutBuffer();
	scope(exit)
		write(buf.toString());

	Generator!long ch = generate();
	foreach (i; 0 .. n) {
		long prime = ch.getNext();
		buf.writef("%d\n",prime);
		ch = filter(ch, prime);
	}
}
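
As a point of comparison, Phobos already ships a fiber-based `Generator` class in `std.concurrency`, which could stand in for the hand-written one above. The sketch below is a variant of the same sieve built on it. `naturals`, `sieve` and `firstPrimes` are hypothetical helper names, and no claim is made about its speed relative to the version above.

```d
import std.concurrency : Generator, yield;
import std.stdio : writeln;

// Endless stream 2, 3, 4, ... produced by a fiber.
Generator!long naturals() {
    return new Generator!long({
        for (long i = 2; ; ++i)
            yield(i);
    });
}

// Wraps a stream and drops every multiple of `prime`.
Generator!long sieve(Generator!long src, long prime) {
    return new Generator!long({
        foreach (v; src)
            if (v % prime != 0)
                yield(v);
    });
}

// Chains one filter per prime found, as in the code above.
long[] firstPrimes(int n) {
    long[] primes;
    auto ch = naturals();
    foreach (_; 0 .. n) {
        immutable prime = ch.front;
        primes ~= prime;
        ch.popFront();         // advance past the prime itself
        ch = sieve(ch, prime); // filter the rest of the stream
    }
    return primes;
}

void main() {
    writeln(firstPrimes(10)); // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
}
```

`std.concurrency.Generator` is an input range, so `front`/`popFront` take the place of the custom `getNext`, and `yield` is the module-level function rather than a static method.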

@renatoathaydes

Not sure if it's valid, but D has a very fast HTTP server as well: https://github.com/tchaloupka/httparsed

Also, are you also trying to speed up the Dart solution?

@cyrusmsk
Contributor Author

> Not sure if it's valid, but D has a very fast HTTP server as well: https://github.com/tchaloupka/httparsed

The D server is unfortunately not very fast. httparsed is good, yes.

> Also, are you also trying to speed up the Dart solution?

No, I’ve never worked with Dart.

@renatoathaydes

> No, I’ve never worked with Dart

Ok, but then I suggest you avoid making claims about Dart being much slower than whatever.

@cyrusmsk
Contributor Author

> No, I’ve never worked with Dart
>
> Ok, but then I suggest you avoid making claims about Dart being much slower than whatever.

Do you know of any type of program or algorithm where Dart has the same performance as C++/Rust?

@renatoathaydes

I don't know why you're asking about that, it has nothing to do with this thread. My original question was why D was running 2 to 3 times slower than Dart, which I found strange... but it seems quite a few Dart examples in this repo are faster than D... you then made an unfounded claim that Dart is much slower than D.

I feel I have to remind you of a non-motivation of the benchmark games:

> We are profoundly uninterested in claims that these measurements, of a few tiny programs, somehow define the relative performance of programming languages aka "Which programming language is fastest."

@cyrusmsk
Contributor Author

> I don't know why you're asking about that, it has nothing to do with this thread. My original question was why D was running 2 to 3 times slower than Dart, which I found strange... but it seems quite a few Dart examples in this repo are faster than D... you then made an unfounded claim that Dart is much slower than D.
>
> I feel I have to remind you of a non-motivation of the benchmark games:
>
> We are profoundly uninterested in claims that these measurements, of a few tiny programs, somehow define the relative performance of programming languages aka "Which programming language is fastest."

I'm sorry if I sounded offensive or negative about Dart. That was not my intention.

For example, I thought it was common knowledge that languages like JS, Ruby and Python are in general quite slow compared to languages like C++/Rust/D, and I assumed Dart belonged to the first category. That doesn’t mean it is bad in any way.

@cyrusmsk
Contributor Author

cyrusmsk commented Dec 25, 2023

> I don't know why you're asking about that, it has nothing to do with this thread. My original question was why D was running 2 to 3 times slower than Dart, which I found strange... but it seems quite a few Dart examples in this repo are faster than D... you then made an unfounded claim that Dart is much slower than D.

To answer your question: it is because of a poor implementation in D, or because of some rules of the benchmark.
In theory D can reach the same speed as C (that is, the C++/Rust level), and those are the fastest languages currently in the benchmark.
Of course, some parts of D's standard library (Phobos) are not designed to be as performant as possible. In those cases you usually need to write your own implementation.
But in general D can match the speed of C++/Rust.

@renatoathaydes

@cyrusmsk no offense taken, I understand where your impression comes from and I think you're right that Dart is normally placed in the same category as JS and Python... but the interesting thing is that since Dart's turn to a fully static type system with null safety and compilation directly to binary executables, it became a lot closer to languages like D and Go, actually... I think it's doing very well in the performance metrics I've seen (including this benchmark) so perhaps it's time for Dart to be considered in a different light.

I want to also mention I have nothing against D; on the contrary, I am currently learning it and I like it a lot... and I understand that you're correct that it can achieve the same performance as C if you go out of your way to optimise it (though as you mention, the stdlib seems to not prioritize performance).

I know Dart much better than I know D, but am thinking of using D more and more because of its ability to go "lower" than Dart (and my other languages, Java, Kotlin, JS, Groovy...) and my unfortunate distaste of Rust, which would absolutely be the most obvious choice today (it's a great language, but the combination of requiring a large number of libraries for anything and the annoyingly slow compiler made me unable to continue using it). That should explain why I reached out to figure out what was going on with D not being much faster than Dart from my initial impressions! It's a lot clearer now what's going on.
