1) From what you wrote, I gather I shouldn't worry too much about speed, since every machine implements these instructions differently. Is that correct? Either way, can you recommend a document that lists clock cycles (or some other speed specification) per instruction? For example, to find out whether pxor is faster than movq on MMX...
2) Should MMX<->XMM transfers still be faster than going through memory on every machine, or not?
3) Since the pitch is mod16, is it safe (and a good idea?) to run a single loop from 0 to pitch*height instead of two nested loops over height and width?
4) Does saturation mean 250+10=255?
5) Should I be interested in MOVNTDQ? I don't know what the benefit of a non-temporal store is.
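On questions 3 and 4, here is a minimal scalar sketch of the idea, not the filter's actual code. The names `sat_add_u8` and `add_plane_flat` are made up for illustration. It assumes both planes have the same pitch and that the padding bytes at the end of each row are allocated and harmless to overwrite, which is what makes the single flat loop acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of an unsigned saturating byte add (what PADDUSB does per byte):
// results above 255 are clamped to 255 instead of wrapping around.
inline std::uint8_t sat_add_u8(std::uint8_t a, std::uint8_t b) {
    const unsigned sum = static_cast<unsigned>(a) + b;
    return static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
}

// Adds src to dst with saturation over the whole plane in one flat loop.
// Assumption: dst and src share the same pitch, and the padding bytes
// (pitch - width per row) are allocated and safe to read and overwrite.
void add_plane_flat(std::uint8_t* dst, const std::uint8_t* src,
                    std::size_t pitch, std::size_t height) {
    assert(dst != nullptr && src != nullptr);
    const std::size_t total = pitch * height;
    for (std::size_t i = 0; i < total; ++i)
        dst[i] = sat_add_u8(dst[i], src[i]);
}
```

So 250+10 gives 255 with saturation, where a plain byte add would wrap to 4. If the source and destination pitches differ, the flat loop no longer lines up and the nested height/width loops are needed.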
So, if I understand correctly what latency and reciprocal throughput mean, here is a little breakdown of my code. It loads data from memory into mm0 and mm2, interleaves it with zeros, multiplies, and adds the results to mm4 and mm5. The expressions in the comments mean: clock for the operation: reciprocal throughput / latency --> clock when the data is ready. Code:
I refuse to try to read stuff that is wider than the screen. Edit your post to collapse the tabs and I might then take an interest. :devil:
Thanks, it makes some sense when you can see the whole picture. ;)
And then jnz, I suppose. Thanks. I have a slightly different question, about calling the filter. At the moment this filter takes a variable number of arguments (clip, float, clip, float, clip, float, ...). I want to add more optional named arguments: y, u, v as integers (obviously for plane processing) and bias as a float. How can I detect what the arguments actually are? FYI, the code below shows how it works now, but please understand that I didn't write it; I took it from somewhere else and modified it. Code:
Well, after spending a day figuring out that it did not work because pmulhw is signed, I think I achieved some speedup. It is hard to tell, though, because I measure CPU Time in Task Manager and it varies by about +-10%. I don't know why, maybe the caching situation. I found cycles.h somewhere and I guess it is what I need for measuring, but I get these messages:
1>C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\Cycles.h(201): warning C4405: 'ret' : identifier is reserved word
1>C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\Cycles.h(463): error C3861: 'elapsed': identifier not found
Does anyone have experience with this and some advice?
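For anyone else who trips over the signed multiply, here is a rough scalar model of one 16-bit lane. It is only a sketch: `mulhi_signed` and `mulhi_unsigned` are illustrative names, and the casts assume the usual two's complement wrap that x86 compilers provide. The first function models PMULHW (signed multiply, keep the high word), the second models the unsigned variant PMULHUW.

```cpp
#include <cstdint>

// Scalar model of PMULHW on one 16-bit lane: both inputs are treated as
// signed, and the high 16 bits of the 32-bit product are returned.
inline std::uint16_t mulhi_signed(std::uint16_t a, std::uint16_t b) {
    const std::int32_t p =
        static_cast<std::int32_t>(static_cast<std::int16_t>(a)) *
        static_cast<std::int32_t>(static_cast<std::int16_t>(b));
    return static_cast<std::uint16_t>(static_cast<std::uint32_t>(p) >> 16);
}

// Unsigned counterpart (what PMULHUW computes per lane).
inline std::uint16_t mulhi_unsigned(std::uint16_t a, std::uint16_t b) {
    const std::uint32_t p =
        static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
    return static_cast<std::uint16_t>(p >> 16);
}
```

The two agree as long as both inputs stay below 0x8000. Once an input reaches 0x8000 or more, the signed version sees it as negative: 0x8000 times 4 yields 0xFFFE signed but 2 unsigned. Keeping the multiplier below 0x8000 is one way to stay on the safe side with PMULHW.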
As others point out, using ".*" in your argument list overrides type checking of both the types and the number of arguments. So your creator code must validate that it received what it expects.
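To make the validation step concrete, here is a self-contained sketch of the check for the (clip, float, clip, float, ...) pattern. Note that `Arg` and `validate_clip_weight_pairs` are hypothetical stand-ins invented for this example, not the AviSynth API; in a real plugin the same logic would presumably inspect the actual argument values and report the message through the environment's error mechanism. Accepting integers where a float is expected is a design choice of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one script argument; NOT the AviSynth value type.
struct Arg {
    enum class Kind { Clip, Float, Int, String };
    Kind kind;
};

// Checks that args is (clip, number) repeated one or more times.
// Returns an empty string on success, otherwise an error message.
std::string validate_clip_weight_pairs(const std::vector<Arg>& args) {
    if (args.empty() || args.size() % 2 != 0)
        return "expected one or more (clip, float) pairs";
    for (std::size_t i = 0; i < args.size(); i += 2) {
        if (args[i].kind != Arg::Kind::Clip)
            return "argument " + std::to_string(i + 1) + " must be a clip";
        const Arg::Kind k = args[i + 1].kind;
        // Design choice of this sketch: an integer is accepted as a weight.
        if (k != Arg::Kind::Float && k != Arg::Kind::Int)
            return "argument " + std::to_string(i + 2) + " must be a number";
    }
    return std::string();
}
```

The optional named arguments (y, u, v, bias) are left out here; they would need their own checks after the unnamed pairs have been validated.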