Scheme + LLVM JIT

Hi List,

I am in the preliminary stages of adding a JIT compiler to a sizable
Scheme system (PLT Scheme). The original plan was to use GNU
Lightning, but 1) it seems to be dead, and 2) LLVM has already done a
huge amount of stuff that I would have had to write (poorly) from
scratch.

At the moment, LLVM seems to be the ideal choice for implementing the
Scheme JIT, but there are problems that need to be addressed first. I
hope you guys can help me with these - I'll list them in descending
order of importance.

Tail Call Elimination:

I've read over the "Random llvm notes", and see that you guys have
thought about this already.

However, the note dates from last year, so I am wondering if there is
an implementation in the works. If no one is working on this or is
planning to work on this in the near future, I would be willing to
give it a shot if I was given some direction as to where to start.

Explicitly managed stack frames would also be nice, but unlike mixed
calling conventions and tail call elimination, they are not a
necessity. For all of you who are wondering about call/cc, we
currently implement it via stack copying (and will continue to), so I
am not worried about LLVM not having a representation for
continuations.

JIT + Optimization interactions:

I have looked over the JIT documentation (which is a bit sparse) and
the examples. So far I am completely unclear as to what the JIT
compiler actually does with the code that is passed to it.

To be more precise, does the JIT perform all of the standard LLVM
optimizations on the code, or does it depend on its client to do so?
Are there some examples of that?

If it does indeed optimize the input, does it attempt to do global
optimizations on the functions (intraprocedural register allocation,
inlining, whatever)?

Does it re-do these optimizations when functions are added/ removed/
changed? Are there parameters to tune the compiler's aggressiveness?

C-Interface:

Does there happen to be a C interface to the JIT? Our Scheme impl
has a good FFI, but it doesn't do C++. If not, this is no big deal,
and I'll just write something myself.

Size of Distro / Compilation Speed

While the sources of llvm are not that big, the project builds very
slowly into something very large. Someone already asked what the
minimum needed for just a JIT compiler is, and I think I have a
vague idea of what needs to be tweaked. However, I want to minimize the
changes I make to my llvm tree. I know that no one can make g++ run
any faster, but part of the speed problem and resulting size of the
compilation is that the configure script seems to ignore my
directives. For example, it always builds all architectures, and it
always statically links each binary.

Well, that's all I can think of for now. Any help will be greatly
appreciated :)

Hi, Alexander!

I am in the preliminary stages of adding a JIT compiler to a sizable
Scheme system (PLT Scheme).

Cool!

The original plan was to use GNU Lightning, but 1) it seems to be
dead, and 2) LLVM has already done a huge amount of stuff that I would
have had to write (poorly) from scratch.

Maybe we can use you for a testimonial... :)

At the moment, LLVM seems to be the ideal choice for implementing the
Scheme JIT, but there are problems that need to be addressed first. I
hope you guys can help me with these - I'll list them in descending
order of importance.

Sounds good, I'll do my best.

Tail Call Elimination:

I've read over the "Random llvm notes", and see that you guys have
thought about this already.

However, the note dates from last year, so I am wondering if there is
an implementation in the works. If no one is working on this or is
planning to work on this in the near future, I would be willing to
give it a shot if I was given some direction as to where to start.

To the best of my knowledge, this has not been done and no one has
announced their intent to work on it, so if you are interested, you'd be
more than welcome to do so.

I have looked over the JIT documentation (which is a bit sparse) and
the examples. So far I am completely unclear as to what the JIT
compiler actually does with the code that is passed to it.

A target runs the passes listed in the method
<target>JITInfo::addPassesToJITCompile() to emit machine code.

To be more precise, does the JIT perform all of the standard LLVM
optimizations on the code, or does it depend on its client to do so?
Are there some examples of that?

No, the JIT performs no optimizations. The method I mentioned above
just lowers the constructs the instruction selector cannot handle (yet)
or things that the target does not have support for. About the only
thing the JIT does (in X86) is eliminate unreachable blocks (dead code).
Then, it's passed on to the instruction selector which creates machine
code and some peephole optimizations are run, then prolog/epilog are
inserted. I glossed over the x86 floating point details, but you get
the idea.

The use case scenario is usually like this:

llvm-gcc/llvm-g++ produces very simple, brain-dead code for a given
C/C++ file. It does not create SSA form, but creates stack allocations
for all variables. This makes it easier to write a front-end. We
turned off all optimizations in GCC and so the code produced by the
C/C++ front-end is really not pretty.

Then, gccas is run on each LLVM assembly file; gccas is basically an
optimizing assembler that runs the optimizations listed in
llvm/tools/gccas/gccas.cpp, which you can inspect.
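
For a JIT client that skips gccas entirely, the same cleanups can be
scheduled in-process. Here is a minimal sketch, assuming the 1.x-era C++
API and an abbreviated pass selection of my own choosing - the
authoritative list is the one in gccas.cpp:

#include "llvm/Module.h"
#include "llvm/PassManager.h"
#include "llvm/Target/TargetData.h"
#include "llvm/Transforms/Scalar.h"

void optimizeLikeGccas(llvm::Module &M) {
  llvm::PassManager PM;
  PM.add(new llvm::TargetData("gccas-like", &M));    // layout info for passes
  PM.add(llvm::createPromoteMemoryToRegisterPass()); // stack slots -> SSA
  PM.add(llvm::createInstructionCombiningPass());    // local simplifications
  PM.add(llvm::createReassociatePass());
  PM.add(llvm::createGCSEPass());                    // global common subexpressions
  PM.add(llvm::createLICMPass());                    // loop-invariant code motion
  PM.add(llvm::createCFGSimplificationPass());
  PM.run(M);                                         // mutates M in place
}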

Once all the files for a program are compiled to bytecode, they are
linked with gccld, an optimizing linker that does a lot of
interprocedural optimization, and creates the final bytecode file.

After this, you can use llc or lli (JIT) on the resulting bytecode, and
llc or lli don't have to do any optimizations, because they have already
been performed.

If it does indeed optimize the input, does it attempt to do global
optimizations on the functions (intraprocedural register allocation,
inlining, whatever)?

The default register allocator in use for most platforms is a
linear-scan register allocator, and the SparcV9 backend uses a
graph-coloring register allocator. However, the JIT performs no
inlining, as mentioned above.

Does it re-do these optimizations when functions are added/ removed/
changed? Are there parameters to tune the compiler's aggressiveness?

There is a JIT::recompileAndRelinkFunction() method, but it doesn't
optimize the code.
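
To give a rough idea, here is a hedged sketch of what a call to it looks
like (the method lives on the 1.x ExecutionEngine interface; treat the
exact signature as my assumption):

#include "llvm/Function.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"

void *refresh(llvm::ExecutionEngine *EE, llvm::Function *F) {
  // ... mutate F's IR here; note that no optimization passes are run ...
  return EE->recompileAndRelinkFunction(F); // pointer to the new native code
}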

Does there happen to be a C interface to the JIT? Our Scheme impl
has a good FFI, but it doesn't do C++. If not, this is no big deal,
and I'll just write something myself.

No, this would have to be added.

While the sources of llvm are not that big, the project builds very
slowly into something very large. Someone already asked what the
minimum needed for just a JIT compiler is, and I think I have a
vague idea of what needs to be tweaked. However, I want to minimize the
changes I make to my llvm tree.

llvm/examples/HowToUseJIT pretty much has the minimal support one needs
for a JIT, but if you can make it even smaller, I'd be interested.

[...] configure script seems to ignore my directives. For example, it
always builds all architectures, ...

Are you using a release or CVS version? Support for this just went into
CVS recently, so you should check it out and see if it works for you.
If you *are* using CVS, are you saying you used `configure
-enable-target=[blah]' and it compiled and linked them all? In that
case, it's a bug, so please post your results over here:

http://llvm.cs.uiuc.edu/PR518

... and it always statically links each binary.

Yes, that is currently the default method of building libraries and
tools. If you were to make all the libraries shared, you would be doing
the same linking/relocating at run-time every time you start the tool.

There is support for loading target backends, etc. from shared objects
with -load, and we may move to the model of having shared objects for
targets in the future, but at present, they are static.

Having more shared libraries may speed up link time, but I suspect will
negatively impact run time.

Maybe we can use you for a testimonial... :)

Certainly.

> Tail Call Elimination:
>
> I've read over the "Random llvm notes", and see that you guys have
> thought about this already.
>
> However, the note dates from last year, so I am wondering if there is
> an implementation in the works. If no one is working on this or is
> planning to work on this in the near future, I would be willing to
> give it a shot if I was given some direction as to where to start.

To the best of my knowledge, this has not been done and no one has
announced their intent to work on it, so if you are interested, you'd be
more than welcome to do so.

My C++ knowledge is completely non-existent, but so far I've had a
surprisingly easy time reading the source. This change seems somewhat
involved - I will have to implement different calling conventions -
i.e., passing a return-address to the callee, etc. Who is the right
person to talk to about this?

The use case scenario is usually like this:

llvm-gcc/llvm-g++ produces very simple, brain-dead code for a given
C/C++ file. It does not create SSA form, but creates stack allocations
for all variables. This makes it easier to write a front-end. We
turned off all optimizations in GCC and so the code produced by the
C/C++ front-end is really not pretty.

[ ... ]

Hi List,

I am in the preliminary stages of adding a JIT compiler to a sizable
Scheme system (PLT Scheme). The original plan was to use GNU
Lightning, but 1) it seems to be dead, and 2) LLVM has already done a
huge amount of stuff that I would have had to write (poorly) from
scratch.

Yay! A real language :)

Explicitly managed stack frames would also be nice, but unlike mixed
calling conventions and tail call elimination, they are not a
necessity. For all of you who are wondering about call/cc, we
currently implement it via stack copying (and will continue to), so I
am not worried about LLVM not having a representation for
continuations.

Mixed calling conventions are on a lot of people's lists of things they
want to see. The only mixed calling convention hack I know of is in the
Alpha backend, to avoid indirect calls and some prologue. However, this
is just a single arch's hack. Work on the general case would probably
receive a fair amount of interest and support.

JIT + Optimization interactions:
To be more precise, does the JIT perform all of the standard LLVM
optimizations on the code, or does it depend on its client to do so?
Are there some examples of that?

So as it stands, one should think of our JIT as something akin to the
early Java JITs: one function at a time and only one compile per
function. This is extremely primitive by modern JIT standards, where a
JIT will do profiling, find hot functions and reoptimize them,
reoptimize functions when more information about the call tree is
available, have several levels of optimizations, etc.

There isn't, AFAIK, anything stopping a user of the JIT from doing much
of this work; however, it would be nice to improve the JIT.

C-Interface:

Does there happen to be a C interface to the JIT? Our Scheme impl
has a good FFI, but it doesn't do C++. If not, this is no big deal,
and I'll just write something myself.

No, but such bindings would be *very useful*. And since there might be
other people who need them this summer, such work might also get a lot
of help.

Size of Distro / Compilation Speed

While the sources of llvm are not that big, the project builds very
slowly into something very large. Someone already asked what the
minimum needed for just a JIT compiler is, and I think I have a
vague idea of what needs to be tweaked. However, I want to minimize the
changes I make to my llvm tree. I know that no one can make g++ run
any faster, but part of the speed problem and resulting size of the
compilation is that the configure script seems to ignore my
directives. For example, it always builds all architectures, and it
always statically links each binary.

See Misha's comment about some new build flags. Also, things are
considerably smaller (an order of magnitude) if one makes a release
build (make ENABLE_OPTIMIZED=1).

> To the best of my knowledge, this has not been done and no one has
> announced their intent to work on it, so if you are interested,
> you'd be more than welcome to do so.

My C++ knowledge is completely non-existent, but so far I've had a
surprisingly easy time reading the source. This change seems somewhat
involved - I will have to implement different calling conventions -
i.e., passing a return-address to the callee, etc. Who is the right
person to talk to about this?

The notes you refer to belong to Chris Lattner, but you should just post
your questions on llvmdev and someone will answer them. The benefits
are that you may get your response faster than emailing someone
directly, you may get multiple perspectives, and the discussion is
archived for future LLVMers who are looking for some similar advice.

Ok, this makes sense. However, am I correct in assuming that the
interprocedural optimizations performed in gccas will make it
problematic to call 'JIT::recompileAndRelinkFunction()'? For example,
suppose I run some module that looks like

module a

int foo () {
...
bar()
...
}

int bar () {
...
}

through all of those optimizations. Will the result necessarily have a
bar() function?

You are correct, it may get inlined.

If inlining is enabled, replacing bar might have no effect if it's
inlined in foo.

True.

If inlining is not enabled, are there other gotchas like this?

There are optimizations that will remove dead code (unreachable basic
blocks, functions that are never called directly or possibly aliased,
global variables that are not referenced or possibly aliased),
optimizations that eliminate unused arguments from functions
interprocedurally, structs may be broken up into constituent elements
(scalar replacement of arguments), arguments may become passed by value
instead of by reference if it's legal to do so, etc.

These all affect the structure of the code and if you care to preserve
some elements of it, you might do well to select your own set of
optimization passes to do what you want and none of the things you
don't.

Alternatively, I *think* you can use the debugging intrinsics to "mark"
what you want to be preserved and it will pessimize the optimizations.
I am not well-versed with the debugging intrinsics, so that's a guess.
See http://llvm.cs.uiuc.edu/docs/SourceLevelDebugging.html for info.

However, let's step back for a second. I am talking about what effect
gccas/gccld will have on code generated by some front-end. Presumably,
you want to write a single stand-alone JIT that will take scheme -> LLVM
-> native code via JIT. Hence, gccas/gccld optimization selection
doesn't really apply to you. You can write your own "tool" that will
use the JIT libraries and the optimizations of your choosing, if you so
desire.
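
Such a tool can be quite small. A hedged sketch, patterned on
llvm/examples/HowToUseJIT (1.x-era API; the entry point name
"scheme_main" is hypothetical, and treat the exact signatures as
assumptions):

#include "llvm/Module.h"
#include "llvm/ModuleProvider.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/GenericValue.h"
#include <vector>

int runSchemeProgram(llvm::Module *M) {
  // Run whatever optimization passes you want on M *before* this point;
  // the JIT itself will not optimize.
  llvm::ExecutionEngine *EE =
      llvm::ExecutionEngine::create(new llvm::ExistingModuleProvider(M), false);
  llvm::Function *Entry = M->getNamedFunction("scheme_main"); // hypothetical
  std::vector<llvm::GenericValue> Args;                       // no arguments
  llvm::GenericValue Result = EE->runFunction(Entry, Args);
  return Result.IntVal; // field name per the 1.x union (an assumption)
}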

If, however, you are using the gccas/gccld + lli model, then we're
talking about two different "Modules", i.e. the one before optimization
that gccas/gccld hack on, and the one after optimization that lli
actually sees and compiles/executes.

Once the bytecode file is passed to the JIT, *that* is what I would call
the "Module being executed". So while you are in the JIT environment,
the JIT will not do any inlining. Any inlining will have already been
performed. So if the JIT has no function "bar", well, it's not that it
"disappeared", it's just that in the lifetime of the JIT, it never
existed in the first place.

If there are complications like this, how much of a performance gain
do the interprocedural opts give?

I don't have any numbers for this, because the inter-procedural
optimizations are bunched in with intra-procedural ones, and we always
run them all. So we don't really have any measurements for the gain of
JUST interprocedural optimizations.

You can try turning off ALL optimizations with

  llvm-gcc -Wa,-disable-opt -Wl,-disable-opt [...]

The first will disable all gccas optimizations, and the second all
gccld optimizations. The problem is that removing them all REALLY
pessimizes the code, as all locals will be stack-allocated, for
instance.

If inlining is really the biggest problem you're facing, there's a
-disable-inlining flag for gccld.

Also, compileAndRelink (F) seems to update references in call sites of
F. Does this mean that every function call incurs an extra 'load', or
is there some cleverer solution?

We don't track all the call sites. Instead, recompile-and-relink
adds an unconditional branch from the old function (in native code) to
the new one (in native code again). This adds an extra indirection to
all subsequent calls to that function, but not an extra load.

One cleverer solution would be to actually track all the call sites, but
then if recompile-and-relink is called rarely, it would be an extra
overhead, not a gain, so it would slow down the common case.

Another cleverer solution would be to overwrite the machine code
in-place with the code for the new function, but then the problem is
that we lay out the code for functions sequentially in memory so as
not to waste space, and hence, each recompilation of the function had
better fit into the place of the old one, or else we might run into the code
region of the next function. This means that we then have to break up
the code region for a function into multiple code sections, possibly
quite far apart in memory, and this leads to more issues.

So we've taken the simpler approach above which works.

Finally, if I jit-compile several modules, can they reference each
other's functions? If this is answered somewhere in the docs, I
apologize.

At present, I am not quite sure that the JIT will accept two different
Modules; most tools (except the linkers) assume a single Module that is
given to them. I have not used two Modules with the JIT and I haven't
seen anyone else do that, so it may be a limitation, or it may just
need some extension to support it; I'm not sure.

Why use linear scan on X86?

I should mention that the SparcV9 is different in structure from the
other backends so the only backend that can use the graph-coloring
register allocator is the SparcV9 backend. The linear-scan was written
in a different format, and it can be used on any backend except the V9.

Also, why linear scan vs. XYZ? Because someone wrote it and contributed
it to LLVM (that someone happens to be Alkis, and he's a member of the
LLVM group). :)

Does it have some benefits over graph-coloring?

Algorithmically, I think it's O(n) for linear scan vs. O(n^3) for graph
coloring. Specifically for LLVM, Alkis wrote a linear-scan allocator
and so we have it. Someone at a different school wrote a graph-coloring
allocator for LLVM, but they haven't contributed it back to us, so we
don't have it.

FWIW, Lal George has a paper on using graph coloring on the
register-poor X86 by implicitly taking advantage of Intel's register
mapping to emulate 32 registers. The result is between 10 and 100%
improvement on the benchmarks he ran (but the allocator is 40%
slower).

It's not that we're against graph-coloring per se, it's just that no one
has stepped up to the plate to do it and share the code with us.

I should mention that we accept patches. :)

> llvm/examples/HowToUseJIT pretty much has the minimal support one needs
> for a JIT, but if you can make it even smaller, I'd be interested.

Sorry, what I actually meant was: what are the minimum libraries that
I have to compile in order to be able to build the HowToUseJIT (and
all the passes in gccas/ld).

The Makefiles for gccas/gccld/lli list the libraries they use in the
USEDLIBS variable. Additionally, lli uses the special meta-library
"JIT" which expands to include the generic JIT libraries and the target
backend(s) that you have selected. See llvm/Makefile.rules, search for
JIT_LIBS and read from there.

You want to take the union of these sets, and that would do it.

Yes, I just tried with CVS and it still compiles all back-ends. I'll
try it again to make sure, and then report the bug.

Instead of opening a new bug, please reopen this one:
http://llvm.cs.uiuc.edu/PR518

It's not the linking/relocating that's the problem. The problem is
that each binary winds up being rather large. However, since these
tools don't need to be distributed or compiled for my purposes, I
guess I'm not really worried about it.

Compiling optimized binaries rather than debug build would save quite a
bit of space in the binaries. Other than that, I'm not really sure,
except maybe to compile LLVM with LLVM and then use its aggressive
optimizations and dead-code elimination transformations? :)

To the best of my knowledge, this has not been done and no one has
announced their intent to work on it, so if you are interested,
you'd be more than welcome to do so.

My C++ knowledge is completely non-existent, but so far I've had a
surprisingly easy time reading the source. This change seems somewhat
involved - I will have to implement different calling conventions -
i.e., passing a return-address to the callee, etc. Who is the right
person to talk to about this?

The notes you refer to belong to Chris Lattner, but you should just post
your questions on llvmdev and someone will answer them. The benefits
are that you may get your response faster than emailing someone
directly, you may get multiple perspectives, and the discussion is
archived for future LLVMers who are looking for some similar advice.

I agree with Misha. This should definitely be discussed on-list if possible.

Ok, this makes sense. However, am I correct in assuming that the
interprocedural optimizations performed in gccas will make it
problematic to call 'JIT::recompileAndRelinkFunction()'? For example,
suppose I run some module that looks like

...

through all of those optimizations. Will the result necessarily have a
bar() function?

You are correct, it may get inlined.

If inlining is enabled, replacing bar might have no effect if it's
inlined in foo.

True.

Yes, this is an important point.

We build LLVM to be as modular as possible, which means that you get to choose exactly which pieces you want to put together into your program. If you're interested in doing function-level replacement, you basically have to avoid *all interprocedural optimizations*. There may be ways around this in specific cases, but anything that assumes something about a function that has been replaced will need to be updated. I don't think there is anything LLVM-specific about this problem though.

However, let's step back for a second. I am talking about what effect
gccas/gccld will have on code generated by some front-end. Presumably,
you want to write a single stand-alone JIT that will take scheme -> LLVM
-> native code via JIT. Hence, gccas/gccld optimization selection
doesn't really apply to you. You can write your own "tool" that will
use the JIT libraries and the optimizations of your choosing, if you so
desire.

Yup, exactly: you get to choose exactly what you want to use :)

If there are complications like this, how much of a performance gain
do the interprocedural opts give?

This is impossible to say: it totally depends on the program. I know that some real-world C codes are sped up by 30-50% in some cases, but others are not sped up at all. I can't say for Scheme programs, but I expect that the situation would be similar.

Also, compileAndRelink (F) seems to update references in call sites of
F. Does this mean that every function call incurs an extra 'load', or
is there some cleverer solution?

We don't track all the call sites. Instead, recompile-and-relink
adds an unconditional branch from the old function (in native code) to
the new one (in native code again). This adds an extra indirection to
all subsequent calls to that function, but not an extra load.

One cleverer solution would be to actually track all the call sites, but
then if recompile-and-relink is called rarely, it would be an extra
overhead, not a gain, so it would slow down the common case.

Actually this is not clear, it might be a win to do this. I don't think anyone has pounded on the replace function functionality enough for this to show up though.

Another cleverer solution would be to overwrite the machine code
in-place with the code for the new function, but then the problem is
that we lay out the code for functions sequentially in memory so as
not to waste space, and hence, each recompilation of the function had
better fit into the place of the old one, or else we might run into the code
region of the next function. This means that we then have to break up
the code region for a function into multiple code sections, possibly
quite far apart in memory, and this leads to more issues.

Also, if the function is currently being executed by a stack frame higher up on the stack, chaos would be unleashed when we returned to that function :)

Finally, if I jit-compile several modules, can they reference each
other's functions? If this is answered somewhere in the docs, I
apologize.

At present, I am not quite sure that the JIT will accept two different
Modules; most tools (except the linkers) assume a single Module that is
given to them. I have not used two Modules with the JIT and I haven't
seen anyone else do that, so it may be a limitation, or it may just
need some extension to support it; I'm not sure.

I don't think lli supports this (yet!), but there is no fundamental reason why it could not be extended, and the JIT library might already work. I'm not sure.

It's not the linking/relocating that's the problem. The problem is
that each binary winds up being rather large. However, since these
tools don't need to be distributed or compiled for my purposes, I
guess I'm not really worried about it.

Compiling optimized binaries rather than debug build would save quite a
bit of space in the binaries. Other than that, I'm not really sure,
except maybe to compile LLVM with LLVM and then use its aggressive
optimizations and dead-code elimination transformations? :)

Like Misha said, please try compiling with 'make ENABLE_OPTIMIZED=1'. This will produce files in the llvm/Release/bin directory which are much smaller than the debug files (e.g. opt goes from 72M -> 4M without debug info).

-Chris

So as it stands, one should think of our JIT as something akin to the
early Java JITs: one function at a time and only one compile per
function. This is extremely primitive by modern JIT standards, where a
JIT will do profiling, find hot functions and reoptimize them,
reoptimize functions when more information about the call tree is
available, have several levels of optimizations, etc.

While this is extremely primitive by modern JIT standards, it is
extremely good by modern Open Source standards, so I'm quite thankful
for it.

If no one else does, this is something I'll investigate in the
future.

> Does there happen to be a C interface to the JIT? Our Scheme impl
> has a good FFI, but it doesn't do C++. If not, this is no big deal,
> and I'll just write something myself.

No, but such bindings would be *very useful*. And since there might be
other people who need them this summer, such work might also get a lot
of help.

I'll probably wind up writing one, and if I do, I will certainly
submit it - it seems like it should just be gluing a bunch of
functions together and providing an 'extern "C"' declaration.

What sort of interface should such an interface provide? The simplest
is "pass-in-a-string-to-compile", but that's rather crude.

To me, the best interface is the most simple: I would suggest just wrapping the llvm classes and methods you need with simple functions, e.g.

llvm_function_new/llvm_value_set_name/llvm_executionengine_run_function, etc.

If kept simple, standardized, and generic, I think it would be very useful to people (even if incomplete). This would allow others to build on it, and we could 'ship' it as a standard llvm library.
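
To make that concrete, a hedged sketch of what two such wrappers could
look like (the C names follow the suggestion above; the wrapped calls
assume the 1.x C++ API):

#include "llvm/Module.h"
#include "llvm/Value.h"

extern "C" void *llvm_module_new(const char *name) {
  return new llvm::Module(name); // opaque handle handed back to C
}

extern "C" void llvm_value_set_name(void *value, const char *name) {
  static_cast<llvm::Value *>(value)->setName(name);
}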

-Chris

Well, it seems that this is actually the next thing on my to-do
list. I'll have something simple in a few days - where/how should I
post stuff to get some criticism?

Also, how are calling conventions/tail calls progressing?

LLVM list,

I bumped into Alex Friedman in the hall today and by coincidence
he mentioned that they were switching to LLVM for their PLT Scheme
JIT project. I had evaluated LLVM a few weeks ago for my own purposes,
but decided that it was too C/C++ centered and that critical features
such as tail call optimization and other stack manipulation features
were likely stagnant. So naturally I asked Alex about tail calls:

"LLVM? What are you going to do about tail calls?"

He replied that they would likely be supported soon and he pointed
me to this discussion. Naturally, this has rekindled my interest in LLVM.

So what's the ETA for tail calls?

Thanks.

I bumped into Alex Friedman in the hall today and by coincidence
he mentioned that they were switching to LLVM for their PLT Scheme
JIT project. I had evaluated LLVM a few weeks ago for my own purposes,
but decided that it was too C/C++ centered and that critical features
such as tail call optimization and other stack manipulation features
were likely stagnant. So naturally I asked Alex about tail calls:

"LLVM? What are you going to do about tail calls?"

He replied that they would likely be supported soon and he pointed
me to this discussion. Naturally, this has rekindled my interest in LLVM.

Indeed, I have been working on them recently. :)

So what's the ETA for tail calls?

All of the LLVM pieces for it are now in place: it is just up to the target maintainers to implement the target-specific support in the codegen for it. I hope to get X86 done before the 1.5 release (due out in a couple of weeks), the other targets may not have support until the next release.

-Chris

to people (even if incomplete). This would allow others to build on it,
and we could 'ship' it as a standard llvm library.

Well, it seems that this is actually the next thing on my to-do
list. I'll have something simple in a few days - where/how should I
post stuff to get some criticism?

This list is a good place.

Also, how are calling conventions/tail calls progressing?

They are going well. All of the pieces are now in place: this is now a valid llvm snippet:

declare fastcc void %bar(int, float)
void %foo() {
         tail call fastcc void %bar(int 1, float 2.0)
         ret void
}

The remaining piece is to build the code generator components to deal with this. Unfortunately I have been context switching a lot, so I haven't made as much progress as I hoped, but I plan to get it done this week (for X86). Whether or not the other target maintainers will implement support for it for the release will be up to them. :)

-Chris

llvm_function_new/llvm_value_set_name/llvm_executionengine_run_function,
etc.

If kept simple, standardized, and generic, I think it would be very useful
to people (even if incomplete). This would allow others to build on it,
and we could 'ship' it as a standard llvm library.

It looks like my interface will look vaguely like this. Functions like
"llvm_make_module" will take in text representations of the
module. This seems to be the fastest way of getting our scheme
implementation to talk to the jit.

This requires being able to parse strings. The LLVM 'Parser.h'
interface (and implementation) has the built-in assumption that it
will always be parsing from the file system.

Would you guys accept a patch that makes it more general (i.e., parse
from file or string)? If so, what's an easy way to do it? Is it
possible to have a "FILE" struct backed by a string?

FYI, here is a sketch of what the fib example would look like:

#include "llvm_c.h"

/*
  Everything takes and receives void * pointers. This is to avoid redefining C++
  types.

  This is the C 'version' of the Fib example, at least in spirit.
*/

char* fib_function =
"int %fib (int %AnArg) {"

" EntryBlock: "
" %cond = setle int AnArg, 2"
" branch bool %cond, label return, label recur"

" return:"
" ret int 1"

" recur:"
" %sub1 = sub int %AnArg, 1"
" %fibx1 = call tail int %fib (int %sub1) "
" %sub2 = sub int %AnArg, 2"
" %fibx2 = call tail int %fib (int %sub2) "
" %result = add int %fibx1, %fibx2"
" ret int %result"
"}";

void * M = llvm_make_module_from_text (program, "test") ;

// now we want to run some optimizations on this thing.
// make a pass manager
void * PM = llvm_make_pass_manager();

// add 'target data' to module
llvm_addPass(PM, make_target_data("jit-lib",M));

llvm_addPass (PM, llvm_createVerifierPass());

// add optimization passes here
// addPass (PM, ... )

llvm_run_passes(PM,M);

// merge modules - not relevant here

// make an execution engine from the module
void * JE = llvm_make_jit_engine (M);

// get a function pointer by name. If you have several functions with the same
// name, you are out of luck
int (*fib) (int) = (int (*)(int)) llvm_get_function_pointer(JE,M,"fib");

// the above cast is probably wrong.

int result =fib (24);

llvm_function_new/llvm_value_set_name/llvm_executionengine_run_function,
etc.

If kept simple, standardized, and generic, I think it would be very useful
to people (even if incomplete). This would allow others to build on it,
and we could 'ship' it as a standard llvm library.

It looks like my interface will look vaguely like this. Functions like
"llvm_make_module" will take in text representations of the
module. This seems to be the fastest way of getting our scheme
implementation to talk to the jit.

ok

This requires being able to parse strings. The LLVM 'Parser.h' interface (and implementation) has the built-in assumption that it will always be parsing from the file system. Would you guys accept a patch that makes it more general (i.e., parse from file or string)?

Yes, that's a generally useful thing to have, I'd like to see it happen if it doesn't impact the efficiency of the lexer.

If so, what's an easy way to do it? Is it possible to have a "FILE" struct backed by a string?

Hrm, I really don't know. :(

FYI, here is a sketch of what the fib example would look like:

#include "llvm_c.h"

/*
Everything takes and receives void * pointers. This is to avoid redefining C++
types.

Makes sense.

This is the C 'version' of the Fib example, at least in spirit.
*/

char* fib_function =
"int %fib (int %AnArg) {"

" EntryBlock: "
" %cond = setle int AnArg, 2"
" branch bool %cond, label return, label recur"

" return:"
" ret int 1"

" recur:"
" %sub1 = sub int %AnArg, 1"
" %fibx1 = call tail int %fib (int %sub1) "
" %sub2 = sub int %AnArg, 2"
" %fibx2 = call tail int %fib (int %sub2) "
" %result = add int %fibx1, %fibx2"
" ret int %result"
"}";

Some typos (branch -> br, 'call tail' -> 'tail call', etc.), but the general idea makes sense.

void * M = llvm_make_module_from_text (program, "test") ;

One of these should be fib_function, right? What is the 'test' string?

// now we want to run some optimizations on this thing.
// make a pass manager
void * PM = llvm_make_pass_manager();

// add 'target data' to module
llvm_addPass(PM, make_target_data("jit-lib",M));

llvm_addPass (PM, llvm_createVerifierPass());

I'd suggest using C-style llvm_create_verifier_pass consistently, but your call :)

// add optimization passes here
// addPass (PM, ... )

llvm_run_passes(PM,M);

// merge modules - not relevant here

// make an execution engine from the module
void * JE = llvm_make_jit_engine (M);

// get a function pointer by name. If you have several functions with the same
// name, you are out of luck
int (*fib) (int) = (int (*)(int)) llvm_get_function_pointer(JE,M,"fib");

// the above cast is probably wrong.

int result =fib (24);

Looks good!

-Chris

> This requires being able to parse strings. The LLVM 'Parser.h' interface
> (and implementation) has the built-in assumption that it will always be
> parsing from the file system. Would you guys accept a patch that makes
> it more general (i.e., parse from file or string)?

Yes, that's a generally useful thing to have, I'd like to see it happen if
it doesn't impact the efficiency of the lexer.

Ok, here's a patch. I added a 'parseAsmString' function to the Parser
interface. It doesn't seem to break any tests, so parsing files seems
to still work ok.

I commented out some code/global variables that don't seem to be
useful anymore - feel free to remove those lines completely if they
are in fact not.

I haven't tested parsing strings yet. My code is extremely simple and
*should* work, but we know where that line of thinking leads. Should I
submit a test case (it would have to be a C file that links in the
parser)?

> void * M = llvm_make_module_from_text (program, "test") ;

One of these should be fib_function, right? What is the 'test' string?

In the fib example, the module is named 'test', so I assumed it was
useful for some reason.

asm_parser.diff (8.8 KB)

This requires being able to parse strings. The LLVM 'Parser.h' interface
(and implementation) has the built-in assumption that it will always be
parsing from the file system. Would you guys accept a patch that makes
it more general (i.e., parse from file or string)?

Yes, that's a generally useful thing to have, I'd like to see it happen if
it doesn't impact the efficiency of the lexer.

Ok, here's a patch. I added a 'parseAsmString' function to the Parser
interface. It doesn't seem to break any tests, so parsing files seems
to still work ok.

This looks basically ok. There are minor things (";;" -> ";"), and the commented out code should be removed.

I'm concerned that this leaks the buffer created for the file, can you verify that it doesn't?

I haven't tested parsing strings yet. My code is extremely simple and
*should* work, but we know where that line of thinking leads. Should I
submit a test case (it would have to be a C file that links in the
parser)?

Sure, that sounds good. I'd definitely prefer that it be tested before it goes into CVS. Perhaps adding something to llvm/examples would be a good way to go.

One suggestion, you might change the API to be something like this:

ParseAsmString(const char *, Module *)

Where the function parses the string and appends it into the specified module. This would make self-extending code simpler (no need to parse into one module then link into the preexisting one).

-Chris

I'm concerned that this leaks the buffer created for the file, can you
verify that it doesn't?

I can verify that the functionality doesn't change. If the buffer was
leaked before, it'll be leaked now.

However, in 'Lexer.l', you have the following code:

<<EOF>> {
                  /* Make sure to free the internal buffers for flex when we are
                   * done reading our input!
                   */
                  yy_delete_buffer(YY_CURRENT_BUFFER);
                  return EOF;
                }

Which should take care of it.

> I haven't tested parsing strings yet. My code is extremely simple and
> *should* work, but we know where that line of thinking leads. Should I
> submit a test case (it would have to be a C file that links in the
> parser)?

Sure, that sounds good. I'd definitely prefer that it be tested before it
goes into CVS. Perhaps adding something to llvm/examples would be a good
way to go.

I made a 'FibInC' example that uses my little c-wrapper (which in turn
uses this). I can submit all of those when they are cleaned up (and
linking correctly - more on that later).

One suggestion, you might change the API to be something like this:

ParseAsmString(const char *, Module *)

Where the function parses the string and appends it into the specified
module. This would make self-extending code simpler (no need to parse
into one module then link into the preexisting one).

This seems reasonable, but what happens to the users of the module if
it is updated - for example, if I load said module in the JIT,
and then add something to it?

About the linking. Is it possible (within the current makefile
framework) to create a shared library that (statically) contains some
set of llvm libraries (including the JIT, which currently only works
if building a tool)?

I wish to be able to do the following, assuming that I named that
library LLVM_C

gcc -lc -lLLVM_C c_file_using_llvm.c -I<stuff> -L<stuff>

I *could* build everything as a shared library, and include everything
on the command line that way, but that seems a bit error-prone, not to
mention the fact that building everything with PIC takes ~3x longer.

I'm concerned that this leaks the buffer created for the file, can you
verify that it doesn't?

I can verify that the functionality doesn't change. If the buffer was
leaked before, it'll be leaked now.

However, in 'Lexer.l', you have the following code:

<<EOF>> {
                 /* Make sure to free the internal buffers for flex when we are
                  * done reading our input!
                  */
                 yy_delete_buffer(YY_CURRENT_BUFFER);
                 return EOF;
               }

Which should take care of it.

Ok, that makes sense.

I made a 'FibInC' example that uses my little c-wrapper (which in turn
uses this). I can submit all of those when they are cleaned up (and
linking correctly - more on that later).

cool, ok.

One suggestion, you might change the API to be something like this:

ParseAsmString(const char *, Module *)

Where the function parses the string and appends it into the specified
module. This would make self-extending code simpler (no need to parse
into one module then link into the preexisting one).

This seems reasonable, but what happens to the users of the module if
it is updated - for example, if I load said module in the JIT,
and then add something to it?

This should be fine. The JIT won't touch a function until it is called.

About the linking. Is it possible (within the current makefile
framework) to create a shared library that (statically) contains some
set of llvm libraries (including the JIT, which currently only works
if building a tool)?

I don't know. It seems possible: each library can be built as a .so file individually. Each directory can also be relinked into a single .o file. It seems logical that you could take these .o files and make a .so file :)

I wish to be able to do the following, assuming that I named that
library LLVM_C

gcc -lc -lLLVM_C c_file_using_llvm.c -I<stuff> -L<stuff>

I *could* build everything as a shared library, and include everything
on the command line that way, but that seems a bit error-prone, not to
mention the fact that building everything with PIC takes ~3x longer.

This should be possible, though our current makefile system is not going to do this automatically right now.

-Chris

> This seems reasonable, but what happens to the users of the module if
> it is updated - for example, if I load said module in the JIT,
> and then add something to it?

This should be fine. The JIT won't touch a function until it is called.

What if you read in a function with the same signature? Will it just
replace it?

> About the linking. Is it possible (within the current makefile
> framework) to create a shared library that (statically) contains some
> set of llvm libraries (including the JIT, which currently only works
> if building a tool)?

I don't know. It seems possible: each library can be built as a .so file
individually. Each directory can also be relinked into a single .o file.
It seems logical that you could take these .o files and make a .so file :)

> I wish to be able to do the following, assuming that I named that
> library LLVM_C
>
> gcc -lc -lLLVM_C c_file_using_llvm.c -I<stuff> -L<stuff>
>
> I *could* build everything as a shared library, and include everything
> on the command line that way, but that seems a bit error-prone, not to
> mention the fact that building everything with PIC takes ~3x longer.

This should be possible, though our current makefile system is not going
to do this automatically right now.

Can I submit a patch to do something like this (if it doesn't
disturb the makefile system that much)?

Maybe if one is building a library and sets a "LINK_IN_LIBS" flag or
something? I am guessing there might be other people that want
something like this - i.e., anybody building a shared library that they
want to use in another application.

This seems reasonable, but what happens to the users of the module if
it is updated - for example, if I load said module in the JIT,
and then add something to it?

This should be fine. The JIT won't touch a function until it is called.

What if you read in a function with the same signature? Will it just
replace it?

No, the caller will have to be careful to remove the body of the old function from the module. The asmparser should be fine with pre-existing prototypes though.
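
A hedged sketch of that caller-side bookkeeping (getNamedFunction and
deleteBody are from the 1.x C++ API; ParseAsmString is the extension
discussed earlier in this thread):

#include "llvm/Module.h"
#include "llvm/Function.h"

void ParseAsmString(const char *Src, llvm::Module *M); // proposed above

void redefineFib(llvm::Module *M, const char *NewSrc) {
  if (llvm::Function *Old = M->getNamedFunction("fib"))
    Old->deleteBody();       // keep the prototype, drop the old body
  ParseAsmString(NewSrc, M); // the new definition satisfies the prototype
}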

I *could* build everything as a shared library, and include everything
on the command line that way, but that seems a bit error-prone, not to
mention the fact that building everything with PIC takes ~3x longer.

This should be possible, though our current makefile system is not going
to do this automatically right now.

Can I submit a patch to do something like this (if it doesn't
disturb the makefile system that much)?

In the abstract, yes, of course :) I'll let Reid deal with the details, though :)

Maybe if one is building a library and sets a "LINK_IN_LIBS" flag or
something? I am guessing there might be other people that want
something like this - i.e., anybody building a shared library that they
want to use in another application.

Makes sense to me!

-Chris