Hacking function pointers in HLSL. The principle is very simple: Instead of using interface and classes as global variables, we can in fact use ...
Update your bookmarks! This blog is now hosted on http://xoofx.com/blog
Thursday, November 24, 2011
Advanced HLSL using closures and function pointers
Shader languages like HLSL, Cg or GLSL are nowadays driving the most powerful processors in the world, but if you are developing with them, you may have already been a little frustrated by one of their expressiveness limitations: the common problem of abstraction and code reuse. To overcome this problem, solutions so far have mostly relied on a glue combination of #define/#include preprocessor directives to generate combinations of code and permutations of shaders, the so-called UberShaders. Recently, this problem has been addressed for HLSL (new in Direct3D 11) by the concept of Dynamic Linking, and for GLSL by the concept of SubRoutines. For Direct3D 11, the new mechanism is only available for Shader Model 5.0, meaning that even if it could greatly simplify the problem of abstraction, it is unfortunately only available on Direct3D 11 class graphics cards, which is of course a huge limitation...
But here is the good news: while the classic usage of dynamic linking is not really possible from earlier versions (like SM 4.0 or SM 3.0), I have found an interesting hack to bring some kind of closures and function pointers to HLSL (!). This solution doesn't involve any kind of preprocessing directive and is able to work with SM 3.0 and SM 4.0, so it might be interesting for folks like me who like to abstract and reuse code as often as possible! But let's see how it can be achieved...
A simple problem of abstraction and code reuse in HLSL
I have recently been working, at my job, on a GPU implementation of a versatile perlin/simplex/fbm/turbulence noise in HLSL. While some of the individual algorithms are pretty simple, it is common to use several permutations of those functions to produce some nice noise and turbulence functions (like the worm-lava texture I did for the Ergon 4k intro). Thus, they are an ideal candidate to demonstrate the use of closures and function pointers. I won't explain here the basic principles of perlin and fbm noise generation, in order to focus on the problem of code reuse in HLSL.
Here is a simplified version of a turbulence noise implemented in a pixel shader:
float PerlinNoise(float2 pos) {
    ....
}

float AbsNoise(float2 pos) {
    return abs(PerlinNoise(pos));
}

float FBMNoise(float2 pos) {
    float value = 0.0f;
    float frequency = InitialFrequency;
    float amplitude = 1.0f;
    // Classic FBM loop
    for (int i = 0; i [...]

// Fbm -> PerlinNoise
float FbmPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0)
    : SV_Target
{
    // Look! We are declaring a local class
    class Noise1 : PerlinNoise {} noise1;
    // and these local classes can access local variables!
    // For example, Noise2 can access the previous noise1 variable.
    class Noise2 : FbmNoise { INoise Next() { return noise1; } } noise2;
    // Allowing us to cascade the calls and make a kind of deferred composition.
    return noise2.Compute(texPos);
}

// Fbm -> Abs -> PerlinNoise
float FbmAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0)
    : SV_Target
{
    class Noise1 : PerlinNoise {} noise1;
    class Noise2 : AbsNoise { INoise Next() { return noise1; } } noise2;
    class Noise3 : FbmNoise { INoise Next() { return noise2; } } noise3;
    // FbmNoise is indirectly calling AbsNoise, which will call PerlinNoise.
    return noise3.Compute(texPos);
}

// Marble -> Fbm -> Abs -> PerlinNoise
float MarbleFbmAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0)
    : SV_Target
{
    class Noise1 : PerlinNoise {} noise1;
    class Noise2 : AbsNoise { INoise Next() { return noise1; } } noise2;
    class Noise3 : FbmNoise { INoise Next() { return noise2; } } noise3;
    class Noise4 : MarbleNoise { INoise Next() { return noise3; } } noise4;
    // MarbleNoise is calling FbmNoise, which is indirectly calling AbsNoise,
    // which will call PerlinNoise.
    return noise4.Compute(texPos);
}

// Fbm -> Marble -> Abs -> PerlinNoise
float FbmMarbleAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0)
    : SV_Target
{
    class Noise1 : PerlinNoise {} noise1;
    class Noise2 : AbsNoise { INoise Next() { return noise1; } } noise2;
    class Noise3 : MarbleNoise { INoise Next() { return noise2; } } noise3;
    class Noise4 : FbmNoise { INoise Next() { return noise3; } } noise4;
    // FbmNoise is calling MarbleNoise, which is indirectly calling AbsNoise,
    // which will call PerlinNoise.
    return noise4.Compute(texPos);
}

Et voilà! As you can see, we are able to declare local classes from a pixel shader that act as closures. It is, for example, even possible to declare local classes that have specific code in their Compute() methods.
Behind the scenes, when chaining the INoise::Next() methods, the fxc HLSL compiler sees all those classes as "INoise*".
It is then possible to perform a fbm(marble(abs(perlin_noise()))) as well as a marble(fbm(abs(perlin_noise()))).
In the end, it is effectively possible to implement closures in HLSL that can be used in SM 4.0 as well as SM 3.0!
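The pixel shaders above rely on an INoise interface and on the PerlinNoise, AbsNoise and FbmNoise classes, whose declarations do not appear in the listings here. What follows is a rough, untested sketch of what they could look like. The names INoise, NoiseBase, Compute, Next and perlin_noise come from this post; the default bodies in NoiseBase, the Octaves, Lacunarity and Gain constants and their values, and the hash used as a stand-in for perlin_noise are all assumptions made only so that the fragment is complete.

```hlsl
// Assumed tuning constants (hypothetical values, not from the post).
static const int   Octaves          = 4;
static const float InitialFrequency = 1.0f;
static const float Lacunarity       = 2.0f;
static const float Gain             = 0.5f;

// Stand-in for a real Perlin noise (assumption): a cheap hash in [-1, 1].
float perlin_noise(float2 pos) {
    return frac(sin(dot(pos, float2(12.9898f, 78.233f))) * 43758.5453f) * 2.0f - 1.0f;
}

// A noise function plus a link to the next function in the chain.
interface INoise {
    float Compute(float2 pos);
    INoise Next();
};

// Assumed default implementation, so derived classes override only what they need.
class NoiseBase : INoise {
    float Compute(float2 pos) { return 0.0f; }
    // Assumed terminator: a well-formed chain never calls Next() on its last link.
    INoise Next() { NoiseBase end; return end; }
};

// Leaf of the chain: does not use Next().
class PerlinNoise : NoiseBase {
    float Compute(float2 pos) { return perlin_noise(pos); }
};

// Wraps whatever Next() returns.
class AbsNoise : NoiseBase {
    float Compute(float2 pos) { return abs(Next().Compute(pos)); }
};

// Sums several octaves of whatever Next() returns.
class FbmNoise : NoiseBase {
    float Compute(float2 pos) {
        float value = 0.0f;
        float frequency = InitialFrequency;
        float amplitude = 1.0f;
        for (int i = 0; i < Octaves; i++) {
            value += Next().Compute(pos * frequency) * amplitude;
            frequency *= Lacunarity;
            amplitude *= Gain;
        }
        return value;
    }
};
```

The point of the design is that no class names its successor: each one only calls Next(), and the local class declared in the pixel shader decides what Next() returns.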
Improving closures chaining
From the previous example, we can extend the concept by
1. Adding static local constructors to each Noise function:
// PerlinNoise implem
class PerlinNoise : NoiseBase {
    float Compute(float2 pos) {
        // call a standard perlin_noise implemented as a simple external function
        return perlin_noise(pos);
    }
    // Add local "constructor"
    static INoise New() {
        PerlinNoise noise;
        return noise;
    }
};

// AbsNoise implem
class AbsNoise : NoiseBase {
    float Compute(float2 pos) {
        // Note: We are using Next to access the next underlying function pointer
        return abs(Next().Compute(pos));
    }
    // Add local constructor and chain with the "from" INoise
    static INoise New(INoise from) {
        class LocalNoise : AbsNoise { INoise Next() { return from; } } noise;
        return noise;
    }
};

// Add the same constructors to FbmNoise and MarbleNoise.
// ....
2. And then we can rewrite the pixel shader functions to chain operators in a shorter form:
// Fbm -> Marble -> Abs -> PerlinNoise
float FbmMarbleAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0)
    : SV_Target
{
    // FbmNoise is calling MarbleNoise, which is indirectly calling AbsNoise,
    // which will call PerlinNoise.
    return FbmNoise::New(MarbleNoise::New(AbsNoise::New(PerlinNoise::New()))).Compute(texPos);
}
This way, it allows a syntax that is even more concise and modular!
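Step 1 leaves the FbmNoise and MarbleNoise constructors out. Following the AbsNoise::New pattern, they might look like the sketch below. It is untested, and only the New(INoise from) pattern itself is taken from this post: the NoiseBase defaults, the hard-coded fBm constants, the marble formula, the local class names and the hash standing in for perlin_noise are all hypothetical.

```hlsl
// Stand-in for a real Perlin noise (assumption): a cheap hash in [-1, 1].
float perlin_noise(float2 pos) {
    return frac(sin(dot(pos, float2(12.9898f, 78.233f))) * 43758.5453f) * 2.0f - 1.0f;
}

interface INoise {
    float Compute(float2 pos);
    INoise Next();
};

// Assumed defaults, so each class overrides only what it needs.
class NoiseBase : INoise {
    float Compute(float2 pos) { return 0.0f; }
    INoise Next() { NoiseBase end; return end; }
};

class PerlinNoise : NoiseBase {
    float Compute(float2 pos) { return perlin_noise(pos); }
    static INoise New() { PerlinNoise noise; return noise; }
};

class FbmNoise : NoiseBase {
    float Compute(float2 pos) {
        float value = 0.0f;
        float frequency = 1.0f;   // hypothetical initial frequency
        float amplitude = 1.0f;
        for (int i = 0; i < 4; i++) {   // 4 octaves, chosen arbitrarily
            value += Next().Compute(pos * frequency) * amplitude;
            frequency *= 2.0f;
            amplitude *= 0.5f;
        }
        return value;
    }
    // Same shape as AbsNoise::New: capture "from" in a local class.
    static INoise New(INoise from) {
        class LocalFbmNoise : FbmNoise { INoise Next() { return from; } } noise;
        return noise;
    }
};

class MarbleNoise : NoiseBase {
    float Compute(float2 pos) {
        // Hypothetical marble formula: a sine stripe perturbed by the next noise.
        return sin(pos.x * 8.0f + 4.0f * Next().Compute(pos));
    }
    static INoise New(INoise from) {
        class LocalMarbleNoise : MarbleNoise { INoise Next() { return from; } } noise;
        return noise;
    }
};

// Marble -> Fbm -> PerlinNoise
float MarbleFbmPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0)
    : SV_Target
{
    return MarbleNoise::New(FbmNoise::New(PerlinNoise::New())).Compute(texPos);
}
```

The local classes are given distinct names here purely as a precaution; the post uses LocalNoise for AbsNoise and does not say whether reusing that name in every constructor matters to fxc.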
Further Considerations
This is a very exciting technique that could open up lots of abstraction opportunities when developing in HLSL. However, in order to use this technique, there are a couple of advantages and things to take into account:
An interface cannot inherit from another interface (that would be really interesting).
An interface can only have method members.
A class can inherit from another class and from several interfaces.
Unlike in C/C++, we cannot pre-declare an interface, but we can use a declaration that is being declared (see the example of the method INoise::Next, returning an INoise).
The compiler has a limitation against the reuse of an implementation in a call chain and will complain about a recursive call (even if there is no recursive call at all). For example, it is not possible to use the same type of class closure twice in a call chain, meaning that it is not possible to make a call chain like this one: Marble => FBM => Marble => Abs => Perlin. The fxc compiler would complain about the second "Marble", as it would see it as a kind of recursive call. In order to reuse a function, we need to duplicate it; that's probably the only point that is annoying here.
The compiled asm output generated from closures is exactly the same as the one obtained using standard inlining methods.
Before settling on local class closures, I tried several techniques that sometimes crashed the fxc compiler.
Thus, as it is a way of hacking the usage of HLSL, it is not guaranteed that this will be supported in the future. But at least, if it is working for SM 5.0, SM 4.0 and 3.0, we can expect that we are safe for a while!
Also, compilation under the vs_3_0/ps_3_0 profiles seems to take more time; I'm not sure whether this is due to the language construction or is regular behavior of the 3.0 profiles.
Let me know if you are able to use this technique and if you find other interesting constructions or problems. It would be very interesting to dig a little more into the opportunities it opens. Lastly, I did a small Google search about this kind of technique but didn't find anything... but it could already have been used by someone else, so this whole technique is a new discovery only hypothetically, but I enjoyed discovering it a lot!
Posted by xoofx - Alexandre MUTEL at 11:09 PM
Labels: D3D10, D3D11, DirectX, hlsl, shader
4 comments:
RCaloca, November 30, 2011 at 4:55 AM: Very clever! Now on to try it on Cg :)
diabol, February 9, 2012 at 3:48 AM: At first I thought: mixing interfaces and abstract classes != closures, then read on and it hit me. I approve.
Johan Verwey, November 15, 2012 at 12:18 AM: You can declare a new class within a function!? Very nice indeed! I didn't know HLSL allowed this. I have been working on a different approach for using higher order functions. A library I am working on translates F# to HLSL. Check it out: https://github.com/rookboom/SharpShaders/wiki/Higher-order-functions
Unknown, April 26, 2013 at 7:36 AM: Love this. I'm using it in my deferred shading engine, to select the materials based on an index, so i don't have to loop trough each option. Really nice indeed