[<<] [<] Page 1 of 1 [>] [>>] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: "Scott Dattalo" ####@####.#### Date: 2 Mar 2007 21:14:40 +0000 Message-Id: <61451.71.139.37.215.1172870052.squirrel@ruckus.brouhaha.com> On Fri, 2007-03-02 at 12:47 -0600, David Barnett wrote: > I've always been frustrated about some big optimizations that can only > be made manually; the linker doesn't seem to have enough information > to optimize them. HI-TECH will avoid these issues by using > "omnicode", meaning all source files are compiled at the same time, > and SourceBoost seems to skirt the problem with a custom object file > format. The two that stand out to me are: > a) code-flow-related optimizations on page and bank selections > b) removal of unused functions and global variables > <snip> David, I agree with you fully (for whatever that's worth). In my opinion, the linker should be capable of Bank selection, Simulator assertions for banks, tail optimization, dead code elimination, peephole optimization, etc. and much more. I suggested at one point that the pCode optimizer I started in SDCC should be ported into gputils. In it's first implementation, it did much of the stuff we'd want. The pCode optimizer was written before the linker was used too much. So consequently, it dealt with single C-files at a time. When linker support became more prevalent, the pCode optimizer had to be changed quite a bit. The only issue I have is time... Scott | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: "David Barnett" ####@####.#### Date: 2 Mar 2007 21:51:05 +0000 Message-Id: <0b0001c75d14$8667ff00$2001a8c0@barnett2> ----- Original Message ----- From: "Scott Dattalo" ####@####.#### To: ####@####.#### Sent: Friday, March 02, 2007 3:14 PM Subject: Re: [gnupic] Linker Optimizations > On Fri, 2007-03-02 at 12:47 -0600, David Barnett wrote: >> I've always been frustrated about some big optimizations that can only >> be made manually; the linker doesn't seem to have enough information >> to optimize them. HI-TECH will avoid these issues by using >> "omnicode", meaning all source files are compiled at the same time, >> and SourceBoost seems to skirt the problem with a custom object file >> format. The two that stand out to me are: >> a) code-flow-related optimizations on page and bank selections >> b) removal of unused functions and global variables >> > > <snip> > > David, > > I agree with you fully (for whatever that's worth). In my opinion, the > linker should be capable of Bank selection, Simulator assertions for > banks, tail optimization, dead code elimination, peephole optimization, > etc. and much more. Well, I'm glad to hear you agree. I figured this wouldn't be a priority for anyone else. > I suggested at one point that the pCode optimizer I started in SDCC should > be ported into gputils. In it's first implementation, it did much of the > stuff we'd want. The pCode optimizer was written before the linker was > used too much. So consequently, it dealt with single C-files at a time. > When linker support became more prevalent, the pCode optimizer had to be > changed quite a bit. This sounds very interesting and ambitious. I'm not quite sure what constitutes pCode, though. For SDCC, that's what gets passed from the front end to the back end, right? I ask because I wasn't aware you could extract pCode from pure assembly code. As I mentioned in the last email, there's some flow information that you could extract from C, but I don't think from asm, that would be necessary for a lot of the optimizations. I'd love to see any information you have about that suggestion to port the pCode optimizer. It looks like gputils may be the best place for me to get started helping out. My biggest ambition is to make some radical improvements to SDCC, eventually. > The only issue I have is time... And nobody blames you there. You've put a ton of time into gnupic projects (Thanks!). What you need is more of other people's time =). David | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: Scott Dattalo ####@####.#### Date: 5 Mar 2007 17:46:23 +0000 Message-Id: <45EC5728.2070100@dattalo.com> David Barnett wrote: > I'm not quite sure what constitutes pCode, though. pCode is a term I came up with. C-compilers (and SDCC included) have an intermediate code generation stage where the code there is called 'iCode'. Prior to the PIC backend, the other SDCC ports went straight from iCode to assembly mnemonics. Once the assembly mnemonics have been generated, then the only optimization available was a rather limited peep hole optimizer. When I added the first PIC port to SDCC, I saw that much further PIC-specific optimizations were possible. One approach I could have taken was to write a subsequent intermediate code optimizer. There was some precedence for this already with port- specific memory and register allocation generation stages. However, I saw further potential with an architecture specific optimizer. In a nutshell, the pCode is a one-to-one mapping of an object to a PIC instruction. On the PIC backend, SDCC converts the iCode directly into pCode. If no further optimizations are applied, then the pCode can be converted in text and would be similar to the other ports. However, with the pCode representation it is possible (or at least easier) to apply optimizations. For example, I rewrote the peephole optimizer for the PIC backend to work with wild cards. This made it possible to write categories of rules as opposed to whole bunch of specific, special case rules. But in general - each pass through one of the optimization stages will convert pCode into potentially better optimized pCode. And since the pCode has a one-to-one mapping onto PIC instructions, the pCode optimizer is working much like a PIC firmware engineer. A few other features of the pCode are tail optimizations, live range analysis, and call tree analysis. These features don't make much sense living in the compiler. It's best to apply these optimizations in the linker. Take the call tree analysis. If the linker performs a call tree analysis and determine that certain functions are never called - then no code needs to be generated for them. Or if it determines that a certain function is only called once, then that function can be inlined. What's really cool is that after inlining you can run the code through the pCode optimizer again. This subsequent pass may end up removing things like parameter passing. You may wish to compare SDCC's approach with gcc. gcc has an intermediate code generation that's represented in a language they call RTL - Register Transfer Language. RTL is a Lisp like language. gcc is a multi-pass compiler. Each pass applies some specific optimization to the RTL. There are ways of writing RTL for specific passes. But the general idea is to apply some kind of optimization strategy to the RTL and run through the RTL objects. The RTL objects either do nothing or convert existing RTL objects into new ones. SDCC's iCode is somewhat like RTL. However, gcc keeps the representation in RTL as long as possible. There is not necessarily a one-to-one mapping between an RTL object and a single assembly instruction. If we were to move pCode into gplink, then this is probably the high level approach that should be taken. - define a set of assembler directives that can control the linker. For example, we'd need to tell the linker things like 'the follow block is really data table and not instructions' or 'the following code has been hand crafted to run at a very specific rate and should not be optimized' or etc... - Introduce an pCode as a do-nothing intermediate stage. It'll take time to modify gplink's infrastructure to accommodate something like pCode. I suspect there are several built in assumptions that need to dealt with. The 'do nothing' pCode provides a type of invariance stage where you can resolve these assumptions before introducing any of the pCode features. - Introduce peep hole optimization Once the code inside the linker is in a pCode format, then the simplest thing to try is the peephole optimizer. It's going to take several man-months to accomplish these high level tasks. Scott | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: "David Barnett" ####@####.#### Date: 5 Mar 2007 18:26:34 +0000 Message-Id: <0bde01c75f53$64887fe0$2001a8c0@barnett2> ----- Original Message ----- From: "Scott Dattalo" ####@####.#### To: ####@####.#### Sent: Monday, March 05, 2007 11:45 AM Subject: Re: [gnupic] Linker Optimizations > David Barnett wrote: >> I'm not quite sure what constitutes pCode, though. <snip> > A few other features of the pCode are tail optimizations, live range > analysis, and call tree analysis. These features don't make much sense > living in the compiler. It's best to apply these optimizations in the > linker. Take the call tree analysis. If the linker performs a call tree > analysis and determine that certain functions are never called - then no > code needs to be generated for them. Or if it determines that a certain > function is only called once, then that function can be inlined. What's > really cool is that after inlining you can run the code through the pCode > optimizer again. This subsequent pass may end up removing things like > parameter passing. Just these optimizations would be worth all the effort. I'm used to working with an 8-level stack and I've had to explicitly inline all kinds of single-call functions just so that I could modularize the code. This approach led to strange header file dependencies since the body has to be defined at each inline declaration, which in turn led to some severe headaches. <snip> > If we were to move pCode into gplink, then this is probably the high level > approach that should be taken. > > - define a set of assembler directives that can control the linker. > For example, we'd need to tell the linker things like 'the follow block is > really data table and not instructions' or 'the following code has been > hand crafted to run at a very specific rate and should not be optimized' > or etc... Good point. I think we'll also want to make sure that most of the optimization needs to be explicitly enabled for backwards-compatibility. <snip> > It's going to take several man-months to accomplish these high level > tasks. Sounds fun =). If those function optimizations will work, it'll solve a lot of problems for me, so I'll definitely keep it in mind to hack at when I get some free time. David | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: "David Barnett" ####@####.#### Date: 5 Mar 2007 18:59:21 +0000 Message-Id: <0bee01c75f57$ee967620$2001a8c0@barnett2> ----- Original Message ----- From: "David Barnett" ####@####.#### To: ####@####.#### Sent: Friday, March 02, 2007 3:48 PM Subject: Re: [gnupic] Linker Optimizations ... > I ask because I wasn't aware you could extract pCode from pure assembly > code. As I mentioned in the last email, there's some flow information > that you could extract from C, but I don't think from asm, that would be > necessary for a lot of the optimizations. ... > David I think these concerns of mine might be significant, so I'll elaborate a bit. For one thing, it's not always a straight-forward task to determine where a function begins or ends. Often the programmer manually optimizes a tail call to a goto, and there may be several exit points including computed gotos. That means that the exit points may be determined by the flow of execution, but there are all sorts of runtime complications to consider. The selected bank will not be known at the entry point(s), nor will any other registers (except possibly PCLATH). The bank selection bits or PC registers (or even FSR) might be modified through INDF, and all that could lead to computed gotos with targets either internal or external to a function. On the other hand, I guess things could probably just as complicated with C source code because, as a trivial example, it could include inline assembly code with the same issues. If we can solve those problems from the assembler/linker end, that will cut out all of the concerns I had with getting the information from the compiler to the linker. Still, it seems like quite a challenge to me. Do you think it's manageable? David | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: Scott Dattalo ####@####.#### Date: 5 Mar 2007 21:45:49 +0000 Message-Id: <45EC8F3F.7040102@dattalo.com> David Barnett wrote: > I think these concerns of mine might be significant, so I'll elaborate > a bit. > > For one thing, it's not always a straight-forward task to determine > where a function begins or ends. Often the programmer manually > optimizes a tail call to a goto, and there may be several exit points > including computed gotos. That means that the exit points may be > determined by the flow of execution, but there are all sorts of > runtime complications to consider. The selected bank will not be known > at the entry point(s), nor will any other registers (except possibly > PCLATH). The bank selection bits or PC registers (or even FSR) might > be modified through INDF, and all that could lead to computed gotos > with targets either internal or external to a function. One of the risks you'll face is that subtle source code changes have drastic impact on code generation. For example, adding a second call to some function may prohibit it from being inlined which in turn causes the 8-level stack to overflow. I suppose if you're counting on this type of optimization, then perhaps you'll need to ensure your assumptions are validated! (For example, you might need to create simulation regression tests.) > On the other hand, I guess things could probably just as complicated > with C source code because, as a trivial example, it could include > inline assembly code with the same issues. If we can solve those > problems from the assembler/linker end, that will cut out all of the > concerns I had with getting the information from the compiler to the > linker. Still, it seems like quite a challenge to me. Do you think > it's manageable? There's no doubt that adding an optimizer to the linker will be a challenging job. As far as being manageable, I guess that depends on who is managing! If you're interested, then I'd suggest diving into gputils/gputils/gplink/ and try getting a basic understanding on how things are structured. In particular, try to understand how program memory and register memory are dealt with. See if it's possible to annotate the preexisting structures with new information you can use for optimization. After that, check out sdcc/sdcc/src/pic/ and attempt to get some basic understanding on how the pcode stuff is used. Scott | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: Scott Dattalo ####@####.#### Date: 5 Mar 2007 23:54:08 +0000 Message-Id: <45ECAD60.80208@dattalo.com> Michael Ambrus just posted this on the SID mailing list: http://www.iecc.com/linker/ This is an online copy of a book "Linkers and Loaders" by John R. Levine. The book's web site is http://linker.iecc.com/ Chapter 11 talks about advanced linker techniques like global optimization. Scott | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: Ralph Corderoy ####@####.#### Date: 6 Mar 2007 10:59:49 +0000 Message-Id: <20070306105917.B491614A8B8@blake.inputplus.co.uk> Hi Scott, > Michael Ambrus just posted this on the SID mailing list: > > http://www.iecc.com/linker/ > > This is an online copy of a book "Linkers and Loaders" by John R. > Levine. The book's web site is It's a good book, I gave a little feedback on a draft IIRC. Regarding great optimisation, people interested in compilers should look at LLVM, http://llvm.org/ It's outstanding the optimisations they've achieved, e.g. a malloc'd linked-list of structs might be turned into one vector per struct member. This removes the per-element malloc overhead, and gives better CPU cache performance. This isn't a dumb translation; it only does it if there's benefits. Cheers, Ralph. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: David ####@####.#### Date: 9 Mar 2007 23:20:25 +0000 Message-Id: <20070309171717.0a9b0a3d@DEEPTHOUGHT.BARNET.net> On Mon, 05 Mar 2007 13:44:31 -0800 Scott Dattalo ####@####.#### wrote: > One of the risks you'll face is that subtle source code changes have > drastic impact on code generation. For example, adding a second call > to some function may prohibit it from being inlined which in turn > causes the 8-level stack to overflow. I suppose if you're counting on > this type of optimization, then perhaps you'll need to ensure your > assumptions are validated! (For example, you might need to create > simulation regression tests.) Most of the functions I'm dealing with are fundamentally designed to be called from only one location. They're things like a "button machine" function to be run from a "main loop". I know there are lots of other scenarios where small changes have big effects, though. Hopefully in time we can make the optimizer smart enough not to get tripped up with some of the more trivial things. > There's no doubt that adding an optimizer to the linker will be a > challenging job. As far as being manageable, I guess that depends on > who is managing! If you're interested, then I'd suggest diving into > gputils/gputils/gplink/ and try getting a basic understanding on how > things are structured. In particular, try to understand how program > memory and register memory are dealt with. See if it's possible to > annotate the preexisting structures with new information you can use > for optimization. FYI, I've been making my way through the headers for gplink and digging through some of the code. It's pretty simple and concise for the most part. I haven't seen anything dealing with memory or registers yet, though. > After that, check out sdcc/sdcc/src/pic/ and > attempt to get some basic understanding on how the pcode stuff is > used. I glanced over that and it made me dizzy =). I think I'll spend some more time going through gplink before getting back to that. BTW, how's your other project coming along? David | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Subject:
Re: [gnupic] Linker Optimizations
From: Scott Dattalo ####@####.#### Date: 10 Mar 2007 00:13:43 +0000 Message-Id: <45F1F7F8.8000004@dattalo.com> David wrote: > FYI, I've been making my way through the headers for gplink and digging > through some of the code. It's pretty simple and concise for the most > part. I haven't seen anything dealing with memory or registers yet, > though. > Craig Franklin writes very clean code. Take a look at libgputils/ and in particularly at gpmemory.c >> After that, check out sdcc/sdcc/src/pic/ and >> attempt to get some basic understanding on how the pcode stuff is >> used. >> > I glanced over that and it made me dizzy =). I think I'll spend some > more time going through gplink before getting back to that. > Craig Franklin writes cleaner code than I do! :) The pCode structures are a lot like C++ classes. > BTW, how's your other project coming along? > If you're referring to the Graphic Display based on the SSD0323, then I can report that I've got all of the interfaces working. Initially there was only the '8080' mode, now there's the 6800 mode and the SPI mode. By the way, the SSP code Roy Rankin added a year ago works perfectly (as far as I can tell). I written a sample .asm file that tests all three modes. I haven't implemented all of the high level functionality of the SSD0323. But I do support the essential commands like set row and column ranges. This work is for a side contract job. That job will probably eat into my gpsim development time (but the pay is better!). The other project that I haven't made any significant progress on over the last few weeks is the multiple processor feature. I've made a few really minor tweaks to the symbol table code (I rewrote the symbol table code as part of the multiple processor effort). Scott | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[<<] [<] Page 1 of 1 [>] [>>] |