Monday, July 8, 2019

EGC - course, CB series, year III, semester 1

CB series: Assoc. Prof. Irina Mocanu



https://ocw.cs.pub.ro/courses/egc/teme/regulament
http://andrei.clubcisco.ro/cursuri/anul-3/semestrul-1/egc.html
https://docs.google.com/spreadsheets/d/1gC1GbngTekbPSD5wPfnRriou0qpSCqwBZ4tfiJhi9Vg/edit#gid=2053557808

2. EGC (Elemente de Grafica pe Calculator - Computer Graphics Elements)

Lecture: Irina Mocanu (again!)
Lab: Gabriel Ivanica


"OupănGele."

Lecture: a ton of theory, explained pretty badly. As always, attendance quizzes... grades range from 2 to 10 (2 for mere mortals, 10 for those who read at home and copied from the slides during the quiz - with a few small exceptions). I stopped going even though the average of the lecture quiz grades gets added to the exam grade... not worth sitting there for 3 hours.

// a small aside: at peak boredom I instantiated a counter... which got to around ~193. basically... I was afraid it would overflow.

The lab: this one was fine; it can be finished fairly quickly but it needs some attention. There are a few trickier parts (especially in 3D, the first person / third person camera part). Mostly... you just follow some TODOs.

The assignments: 4 fairly labor-intensive assignments (with bonuses included). The last one (with textures) seemed a bit easier - it was also worth more points. I know there was at some point a rule of at most 3 assignments per course... but 'dem rebels. All of them have to be presented in specific weeks during the lab (that is roughly when they get graded too).

The exam: comfortable; a sheet with 6 questions, pretty simple - derive a projection matrix, explain some algorithms from the lectures; people started leaving 20 minutes before it ended. Personally I couldn't manage to study from the provided PDFs... I find the material horribly presented - long live Wikipedia!

 5. EGC - computer graphics elements, a course where you learn about geometric transformations, lights, shadows, fog, and a few rasterization algorithms that video cards run (e.g. how a circle gets drawn out of pixels). The lecture is maximally boring, but there is a monster bonus: 1 point just for lecture attendance. The assignments are long - I wrote about 800 lines per assignment on average - but they are graded quite leniently; basically you have to build little games. The lab is fine, the easiest lab (in one of them, the lighting one, you had to write exactly 2 lines). There are bonuses on both the assignments and the lab. There are 4 assignments; the last one is worth 1.5 points (ironically, the last one was very easy), the rest are worth 1 point each. The lab is worth 1.5 points. There are lots of bonuses - some colleagues racked up so many that a 5 on the exam would still have given them a final grade of 10. Even so, it did not really excite me, because I find graphics programming very, very boring.
How to pass: you need to do at least 2 assignments, or at least parts of them. It is easy to get partial credit with very weak submissions, because they are graded very generously. On assignment 2, if you drew a grid of squares and made those squares colorable brown, you already had 40 points, if I remember correctly, so it is easy to pull partial scores. The exam is standard, same as in other years. Pretty much all exams this year were exactly like previous years', except APD where literally any algorithm can come up.
How to get a 10: do all the assignments - that will take a lot of time - and go to the lectures. Study the past exam questions well and you will surely get a high grade.


1) Computer graphics elements (Elemente de grafica pe calculator) - EGC; prof. Florica Moldoveanu; lecture: 7; lab: 9; exam: 7
You get acquainted with vice-dean Florica Moldoveanu, who is a lady in the true sense of the word: she has presence, she knows how to speak. She is not the jokey type the way Mrs. Nita was. You can meet her in 2 more courses besides this one if you go down a certain track, so the opinion you form now will be defining: all her courses resemble one another.
The lecture walks you through graphics theory, with some math and some geometry sprinkled in, and I think there were some limits too. Anyway, it is not very complicated math. You cover roughly the transformations used for projections, rotation, scaling and translation transformations, how to draw something 3D when all you have is 2D, comrade Bezier (who made a pile of curves in his lifetime), and rasterization (how you draw something until it turns into pixels, a pixel being something tiny compared to what you want to draw).
So if you are someone who wonders how the heck everything gets drawn on the screen, this course should answer that. The course is called "Elements of...", right?
I did the lab with Victor Asavei, a chubby guy in dress pants and sneakers, who is among the few teaching assistants who actually know what it is about: very helpful, they assist you, they give feedback. (We will talk later about all sorts of characters and lowlifes you should run from like the plague - your boy here will tell you about all of them.) If graphics is your passion, it is worth doing the lab with him because you find out a lot of things, not just about graphics itself but about programming in general. We chatted a couple of times about some Java and C# stuff, some tricks for writing more elegant code, and he struck me as capable. I also did some labs with Alex Egner, who was a student, a very short and smart guy who explains any little thing with maximum seriousness. Super serious but fun.
If, however, you feel like slacking and you are not interested in this subject, it is better not to take the lab with him, and to try to find someone more lenient. Anca Morar is a sweetheart; I had her for the lab in another course and we did not go into nearly as much detail.
In the lab you start from a huge codebase to which you have to add at most 50 lines of code to complete all the tasks. The first 3-4 labs are in Java, and if you do not know Java you will stare blankly at how they have already set up class derivations and OOP-ish method calls. You will rotate, scale and translate until you are sick of it; I think in the first 4 labs you have to scale and translate now a triangle, now a cube. In Java all you will do is draw lines like crazy, between mathematical points computed with fairly large formulas. If you get a formula wrong the whole drawing goes to hell (you will see in the assignments).
If you think it might get easier after Java, well, it does not: you will also have to write C++, which of course you will not know (OpenGL is used, a pretty slick drawing API). You will mix C-style code into C++ and get a pile of warnings.
There are a lot of assignments: 5 of them. The first 2 in Java (and to be admitted to the exam you have to do one), and the next 3 in OpenGL (and to be admitted to the exam you have to do one of these too). That means that every 2 weeks you have something to draw. The Java assignments are about drawing lines and are more doable. What does an assignment look like? For example, in assignment 1 we had to draw a little car (a triangle) moving on a track (so one of those big polygons). You controlled it from the keyboard, and if you hit a wall you could not move anymore. So you can imagine how many calculations you have to do. Then in assignment 2 you are already drawing cubes that translate and scale around the sun, symbolizing planets. You will probably go crazy over the "window-to-viewport" part, which means you need a kind of zoom on an object, which means: guess what? another translation and another scaling! Then in OpenGL you work your way up to assignment 5 (I never got that far), where you literally build games: from one of those games with a ship that fires shells at alien ships and kills them (and you can imagine how much you have to write to compute intersections between projectiles and irregular bodies) up to a sort of Counter-Strike in assignment 5, where you see, first-person, someone walking around with a weapon in hand, jumping on stairs, shooting a rifle and killing objects.
It sounds cool if you are passionate about this stuff. I was not.
The exam is typical: all the material. You get 8 questions and you will write about 20-25 pages. It simply says: 3D transformations. And you reproduce the lecture word for word. It is one of the exams I studied the most for. There are not really any favorite topics, but they tend to be repeated from year to year. In any case, you are not exempted from any part of the material. If you have got the guts, gamble on the recycled questions.
Minimum effort: do the Java assignments, and for OpenGL do the first of the 3, since it is easier. The other two, 4 and 5, are killers (at least 1000-and-some lines). Keep the lab as light as possible, and at the exam count on the recycled questions. All in all you get an 8.
What I still remember from this course after 1 year: almost nothing.







Tutorials that the lectures were based on: from Tutorial 10 through Tutorial 20




Background

OpenGL provides several draw functions. glDrawArrays() that we have been using until now falls under the category of "ordered draws". This means that the vertex buffer is scanned from the specified offset and every X (1 for points, 2 for lines, etc) vertices a primitive is emitted. This is very simple to use but the downside is if a vertex is part of several primitives then it must be present several times in the vertex buffer. That is, there is no concept of sharing. Sharing is provided by the draw functions that belong to the "indexed draws" category. Here in addition to the vertex buffer there is also an index buffer that contains indices into the vertex buffer. Scanning the index buffer is similar to scanning the vertex buffer - every X indices a primitive is emitted. To exercise sharing you simply repeat the index of the shared vertex several times. Sharing is very important for memory efficiency because most objects are represented by some closed mesh of triangles and most vertices participate in more than one triangle.
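For example (my own illustration, using the tutorial's Vector3f type), two triangles that share an edge need only 4 vertices with an indexed draw, instead of the 6 an ordered draw would require:

Vector3f Vertices[4];                         // the 4 corners of a quad
Vertices[0] = Vector3f(-1.0f, -1.0f, 0.0f);
Vertices[1] = Vector3f( 1.0f, -1.0f, 0.0f);
Vertices[2] = Vector3f( 1.0f,  1.0f, 0.0f);
Vertices[3] = Vector3f(-1.0f,  1.0f, 0.0f);

// Vertices 0 and 2 are shared by both triangles; they appear once in the
// vertex buffer and only their indices are repeated.
unsigned int Indices[] = { 0, 1, 2,
                           0, 2, 3 };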
Here is an example of an ordered draw:



If we are rendering triangles the GPU will generate the following set: V0/1/2, V3/4/5, V6/7/8, etc.
Here is an example of an indexed draw:










In this case the GPU will generate the following triangles: V2/0/1, V5/2/4, V6/5/7, etc.
Using indexed draws in OpenGL requires generating and populating an index buffer. That buffer must be bound in addition to the vertex buffer before the draw call, and a different draw API must be used.

Source walkthru

GLuint IBO;
We added another buffer object handle for the index buffer.
Vertices[0] = Vector3f(-1.0f, -1.0f, 0.0f);
Vertices[1] = Vector3f(0.0f, -1.0f, 1.0f);
Vertices[2] = Vector3f(1.0f, -1.0f, 0.0f);
Vertices[3] = Vector3f(0.0f, 1.0f, 0.0f);
To demonstrate vertex sharing we need a mesh which is a bit more complex. Many tutorials use the famous spinning cube for that. This requires 8 vertices and 12 triangles. Since I'm lazy I use the spinning pyramid instead. This requires only 4 vertices and 4 triangles and is much easier to generate manually...
When looking at these vertices from the top (along the Y axis) we get the following layout:

unsigned int Indices[] = { 0, 3, 1,
                           1, 3, 2,
                           2, 3, 0,
                           0, 1, 2 };
The index buffer is populated using an array of indices. The indices match the location of the vertices in the vertex buffer. When looking at the array and the diagram above you can see that the last triangle is the pyramid base while the other three make up its faces. The pyramid is not symmetric but is very easy to specify.
glGenBuffers(1, &IBO);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, IBO);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(Indices), Indices, GL_STATIC_DRAW);
We create and then populate the index buffer using the array of indices. You can see that the only difference in creating vertex and index buffers is that vertex buffers take GL_ARRAY_BUFFER as the buffer type while index buffers take GL_ELEMENT_ARRAY_BUFFER.
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, IBO);
In addition to binding the vertex buffer we must also bind the index buffer prior to drawing. Again, we use the GL_ELEMENT_ARRAY_BUFFER as the buffer type.
glDrawElements(GL_TRIANGLES, 12, GL_UNSIGNED_INT, 0);
We use glDrawElements instead of glDrawArrays. The first parameter is the primitive type to render (same as glDrawArrays). The second parameter is the number of indices in the index buffer to use for primitive generation. The next parameter is the type of each index. The GPU must be told the size of each individual index else it will not know how to parse the buffer. Possible values here are GL_UNSIGNED_BYTE, GL_UNSIGNED_SHORT, GL_UNSIGNED_INT. If the index range is small you want the smaller types that are more space efficient and if the index range is large you want the larger types. The final parameter tells the GPU the offset in bytes from the start of the index buffer to the location of the first index to scan. This is useful when the same index buffer contains the indices of multiple objects. By specifying the offset and count you can tell the GPU which object to render. In our case we want to start at the beginning so we specify zero. Note that the type of the last parameter is GLvoid* so if you specify anything other than zero you need to cast it to that type.
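Putting it all together, a minimal sketch of the complete indexed-draw flow (reusing the handles from the tutorial code; error checking omitted) might look like this:

// One-time setup: create and fill the vertex and index buffers.
glGenBuffers(1, &VBO);
glBindBuffer(GL_ARRAY_BUFFER, VBO);
glBufferData(GL_ARRAY_BUFFER, sizeof(Vertices), Vertices, GL_STATIC_DRAW);

glGenBuffers(1, &IBO);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, IBO);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(Indices), Indices, GL_STATIC_DRAW);

// Per frame: bind both buffers, describe the vertex layout and draw.
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER, VBO);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, IBO);
glDrawElements(GL_TRIANGLES, 12, GL_UNSIGNED_INT, 0);
glDisableVertexAttribArray(0);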
Next tutorial

Background

In the last few tutorials we have developed several transformations that give us the flexibility of moving an object anywhere in the 3D world. We still have a couple more to learn (camera control and perspective projection) but, as you probably already guessed, a combination of these transformations is required. In most cases you will want to scale the object to fit your 3D world, rotate it into the required orientation, move it somewhere, etc. Up till now we have been exercising a single transformation at a time. In order to perform the above series of transformations we need to multiply the first transformation matrix by the vertex position and then multiply the next transformation matrix by the result of the previous multiplication. This goes on until all the transformation matrices have been applied to the vertex. One trivial way to do that is to supply each and every transformation matrix to the shader and let it do all the multiplications. This, however, is very inefficient since the matrices are the same for all vertices and only the vertex position changes. Luckily, linear algebra provides a set of rules that make our life easier. It tells us that given a set of matrices M0...Mn and a vector V the following holds true:
Mn * Mn-1 * ... * M0 * V = (Mn * Mn-1 * ... * M0) * V
So if you calculate:
N = Mn * Mn-1 * ... * M0
Then:
Mn * Mn-1 * ... * M0 * V = N * V
This means that we can calculate N once and then send it to the shader as a uniform variable, where it will be multiplied by each vertex. This leaves the GPU with a single matrix/vector multiplication per vertex.
How do you order the matrices when generating N? The first thing you need to remember is that the vector is initially multiplied by the matrix on the far right of the series (in our case, M0). Then the vector is transformed by each matrix as we travel from the right hand side to the left hand side. In 3D graphics you usually want to scale the object first, then rotate it, then translate it, then apply the camera transformation and finally project it to 2D. Let's see what happens when you rotate first and then translate:








Now see what happens when you translate first and then rotate:









As you can see, it is very difficult to set the object's position in the world when you translate it first, because if you move it away from the origin and then rotate, it revolves around the origin, which effectively means that you translate it again. This second translation is something you want to avoid. By rotating first and then translating you decouple the two operations. This is why it is always best to model around the origin as symmetrically as possible. That way, when you later scale or rotate there is no side effect and the rotated or scaled object remains as symmetrical as before.
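To make the difference concrete, here is a tiny self-contained C++ sketch (my own illustration, not part of the tutorial code) that applies a 90 degree rotation and a translation of (5, 0) to the point (1, 0) in both orders:

#include <cstdio>

int main()
{
    // cos/sin of a 90 degree rotation
    const float c = 0.0f, s = 1.0f;
    const float tx = 5.0f, ty = 0.0f;          // translation
    const float x = 1.0f, y = 0.0f;            // the original point

    // Rotate first, then translate: (1,0) -> (0,1) -> (5,1).
    // The object rotates in place and then lands exactly where we asked.
    printf("rotate then translate: (%.1f, %.1f)\n",
           (x * c - y * s) + tx, (x * s + y * c) + ty);

    // Translate first, then rotate: (1,0) -> (6,0) -> (0,6).
    // The already-moved object swings around the origin - the unwanted
    // "second translation" described above.
    printf("translate then rotate: (%.1f, %.1f)\n",
           (x + tx) * c - (y + ty) * s, (x + tx) * s + (y + ty) * c);
    return 0;
}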
Now that we are starting to handle more than one transformation in the demos we have to drop the habit of updating the matrix directly in the render function. This method doesn't scale well and is prone to errors. Instead, the Pipeline class is introduced. This class hides the fine details of matrix manipulation under a simple API for setting the translation, rotation, etc. After setting all the parameters you simply extract the final matrix that combines all the transformations. This matrix can be fed directly to the shader.

Source walkthru

#define ToRadian(x) ((x) * M_PI / 180.0f)
#define ToDegree(x) ((x) * 180.0f / M_PI)
We are starting to use actual angle values in this tutorial. As it happens, the trigonometric functions of the standard C library take radians as parameters. The above macros convert an angle between degrees and radians.
inline Matrix4f operator*(const Matrix4f& Right) const
{
    Matrix4f Ret;
    for (unsigned int i = 0 ; i < 4 ; i++) {
       for (unsigned int j = 0 ; j < 4 ; j++) {
           Ret.m[i][j] = m[i][0] * Right.m[0][j] +
                         m[i][1] * Right.m[1][j] +
                         m[i][2] * Right.m[2][j] +
                         m[i][3] * Right.m[3][j];
       }
    }

    return Ret;
}
This handy operator of the matrix class handles matrix multiplication. As you can see, each entry in the resulting matrix is defined as the dot product of its row in the left matrix with its column in the right matrix. This operator is key to the implementation of the Pipeline class.
class Pipeline
{
    public:
       Pipeline() { ... }
       void Scale(float ScaleX, float ScaleY, float ScaleZ) { ... }
       void WorldPos(float x, float y, float z) { ... }
       void Rotate(float RotateX, float RotateY, float RotateZ) { ... }
       const Matrix4f* GetTrans();
    private:
       Vector3f m_scale;
       Vector3f m_worldPos;
       Vector3f m_rotateInfo;
       Matrix4f m_transformation;
};
The Pipeline class abstracts the details of combining all the transformations required for a single object. There are currently 3 private member vectors that store the scaling, the position in world space and the rotation around each axis. In addition there are APIs to set them and a function to get the matrix that represents the combination of all these transformations.
const Matrix4f* Pipeline::GetTrans()
{
    Matrix4f ScaleTrans, RotateTrans, TranslationTrans;
    InitScaleTransform(ScaleTrans);
    InitRotateTransform(RotateTrans);
    InitTranslationTransform(TranslationTrans);
    m_transformation = TranslationTrans * RotateTrans * ScaleTrans;
    return &m_transformation;
}
This function initializes three separate matrices as the transformations that match the current configuration. It multiplies them one by one and returns the final product. Note that the order is hard coded and follows the description above. If you need some flexibility there you can use a bitmask that specifies the order. Also note that it always stores the final transformation as a member. You can try optimizing this function by checking a dirty flag and returning the stored matrix in the case that there was no change in configuration since the last time this function was called.
This function uses private methods to generate the different transformations according to what we've learned in the last few tutorials. In the next tutorials this class will be extended to handle camera control and perspective projection.
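The private Init*Transform helpers are not listed in this walkthrough; a plausible sketch of two of them, using the same row-major convention as the multiplication operator above (signatures assumed), could look like this:

void Pipeline::InitScaleTransform(Matrix4f& m) const
{
    // Diagonal scaling matrix built from m_scale.
    m.m[0][0] = m_scale.x; m.m[0][1] = 0.0f;      m.m[0][2] = 0.0f;      m.m[0][3] = 0.0f;
    m.m[1][0] = 0.0f;      m.m[1][1] = m_scale.y; m.m[1][2] = 0.0f;      m.m[1][3] = 0.0f;
    m.m[2][0] = 0.0f;      m.m[2][1] = 0.0f;      m.m[2][2] = m_scale.z; m.m[2][3] = 0.0f;
    m.m[3][0] = 0.0f;      m.m[3][1] = 0.0f;      m.m[3][2] = 0.0f;      m.m[3][3] = 1.0f;
}

void Pipeline::InitTranslationTransform(Matrix4f& m) const
{
    // The position is multiplied on the right as a column vector,
    // so the translation goes into the last column.
    m.m[0][0] = 1.0f; m.m[0][1] = 0.0f; m.m[0][2] = 0.0f; m.m[0][3] = m_worldPos.x;
    m.m[1][0] = 0.0f; m.m[1][1] = 1.0f; m.m[1][2] = 0.0f; m.m[1][3] = m_worldPos.y;
    m.m[2][0] = 0.0f; m.m[2][1] = 0.0f; m.m[2][2] = 1.0f; m.m[2][3] = m_worldPos.z;
    m.m[3][0] = 0.0f; m.m[3][1] = 0.0f; m.m[3][2] = 0.0f; m.m[3][3] = 1.0f;
}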
Pipeline p;
p.Scale(sinf(Scale * 0.1f), sinf(Scale * 0.1f), sinf(Scale * 0.1f));
p.WorldPos(sinf(Scale), 0.0f, 0.0f);
p.Rotate(sinf(Scale) * 90.0f, sinf(Scale) * 90.0f, sinf(Scale) * 90.0f);
glUniformMatrix4fv(gWorldLocation, 1, GL_TRUE, (const GLfloat*)p.GetTrans());
These are the changes to the render function. We allocate a pipeline object, configure it and send the resulting transformation down to the shader. Play with the parameters and see their effect on the final image.
For more information on this subject check out the following video tutorial by Frahaan Hussain.
  

Background

We have finally reached the item that represents 3D graphics best - the projection from the 3D world onto a 2D plane while maintaining the appearance of depth. A good example is a picture of a road or railway tracks that seem to converge to a single point far away on the horizon.
We are going to generate the transformation that satisfies the above requirement, and we want to "piggyback" an additional requirement on it - making life easier for the clipper by representing the projected coordinates in a normalized space of -1 to +1. This means the clipper can do its work without knowing the screen dimensions or the locations of the near and far planes.
The perspective projection transformation requires us to supply 4 parameters:
  1. The aspect ratio - the ratio between the width and the height of the rectangular area which will be the target of projection.
  2. The vertical field of view: the vertical angle of the camera through which we are looking at the world.
  3. The location of the near Z plane. This allows us to clip objects that are too close to the camera.
  4. The location of the far Z plane. This allows us to clip objects that are too distant from the camera.
The aspect ratio is required since we are going to represent all coordinates in a normalized space whose width is equal to its height. Since this is rarely the case with the screen, where the width is usually larger than the height, the transformation needs to account for it by somehow "condensing" the points along the horizontal axis relative to the vertical one. This lets us squeeze more coordinates into the X component of the normalized space, which satisfies the requirement of "seeing" more along the width than along the height in the final image.
The vertical field of view allows us to zoom in and out on the world. Consider two views of the same scene: with a wider angle, objects appear smaller, while with a smaller angle the same object appears larger. Note that this has a somewhat counter-intuitive effect on the location of the camera: with the smaller field of view (zoomed in) the camera has to be placed further away from the projection plane, while with the wider field of view it sits closer to it. However, remember that this has no real effect since the projected coordinates are mapped to the screen and the location of the camera plays no part.
We start by determining the distance of the projection plane from the camera. The projection plane is parallel to the XY plane. Obviously we cannot see all of it; we can only see stuff in a rectangular area (called the projection window) which has the same proportions as our screen. The aspect ratio is calculated as follows:
ar = screen width / screen height
Let us conveniently set the height of the projection window to 2, which means the width is exactly twice the aspect ratio (see the above equation). If we place the camera at the origin and look at the area from behind the camera's back we will see the following:
Anything outside this rectangle is going to be clipped away and we already see that coordinates inside it will have their Y component in the required range. The X component is currently a bit bigger but we will provide a fix later on.
Now let's take a look at this "from the side" (looking down at the YZ plane):
We find the distance from the camera to the projection plane using the vertical field of view (denoted by the angle alpha):
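The accompanying figure is not reproduced here; with a projection window of height 2 (half-height 1), the relation it illustrates reduces to:

tan(alpha/2) = 1 / d    =>    d = 1 / tan(alpha/2)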
The next step is to calculate the projected coordinates of X and Y. Consider the next image (again looking down at the YZ plane).
We have a point in the 3D world with the coordinates (x,y,z). We want to find (xp,yp) that represent the projected coordinates on the projection plane. Since the X component is out of scope in this diagram (it is pointing in and out of the page) we'll start with Y. According to the rule of similar triangles we can determine the following:
In the same manner for the X component:
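Reconstructed from the similar-triangles argument and the distance d above, the two relations should be:

yp / d = y / z    =>    yp = y / (z * tan(alpha/2))
xp / d = x / z    =>    xp = x / (z * tan(alpha/2))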
Since our projection window is 2*ar (width) by 2 (height) in size we know that a point in the 3D world is inside the window if it is projected to a point whose projected X component is between -ar and +ar and the projected Y component is between -1 and +1. So on the Y component we are normalized but on the X component we are not. We can get Xp normalized as well by further dividing it by the aspect ratio. This means that a point whose projected X component was +ar is now +1 which places it on the right hand side of the normalized box. If its projected X component was +0.5 and the aspect ratio was 1.333 (which is what we get on a 1024x768 screen) the new projected X component is 0.375. To summarize, the division by the aspect ratio has the effect of condensing the points on the X axis.
We have reached the following projection equations for the X and Y components:
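The equations, reconstructed so that they agree with the projection matrix in the source walkthrough below, are:

xp = x / (ar * tan(alpha/2) * z)
yp = y / (tan(alpha/2) * z)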
Before completing the full process let's try to see how the projection matrix would look at this point. This means representing the above using a matrix. Now we run into a problem: in both equations we need to divide X and Y by Z, which is part of the vector that represents the position. However, the value of Z changes from one vertex to the next, so you cannot place it into one matrix that projects all vertices. To understand this better think about the top row vector of the matrix (a, b, c, d). We need to select the values of the vector such that the following will hold true:
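The required condition, roughly, is:

(a, b, c, d) . (x, y, z, 1) = a*x + b*y + c*z + d = x / (ar * tan(alpha/2) * z)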
This is the dot product of the top row vector of the matrix with the vertex position, which yields the final X component. We can select 'b' and 'd' to be zero but we cannot find an 'a' and 'c' that can be plugged into the left hand side and provide the result on the right hand side. The solution adopted by OpenGL is to separate the transformation into two parts: a multiplication by a projection matrix, followed by a division by the Z value as an independent step. The matrix is provided by the application and the shader must include the multiplication of the position by it. The division by Z is hard wired into the GPU and takes place in the rasterizer (somewhere between the vertex shader and the fragment shader). How does the GPU know which vertex shader output to divide by its Z value? Simple - the built-in variable gl_Position is designated for that job. Now we only need to find a matrix that represents the projection equations of X and Y above.
After multiplying by that matrix the GPU can divide by Z automatically for us and we get the result we want. But here's another complication: if we multiply the matrix by the vertex position and then divide by Z we literally lose the Z value because it becomes 1 for all vertices. The original Z value must be saved in order to perform the depth test later on. So the trick is to copy the original Z value into the W component of the resulting vector and divide only XYZ by W. W maintains the original Z, which can be used for the depth test. The automatic step of dividing gl_Position by its W is called 'perspective divide'.
We can now generate an intermediate matrix that represents the above two equations as well as the copying of the Z into the W component:
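Based on the equations above, that intermediate matrix is presumably (row-major, with the position multiplied on the right as a column vector):

1/(ar*tan(alpha/2))    0                 0    0
0                      1/tan(alpha/2)    0    0
0                      0                 0    0
0                      0                 1    0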
As I said earlier, we want to include the normalization of the Z value as well, to make it easier for the clipper to work without knowing the near and far Z values. However, the matrix above turns Z into zero. Knowing that after transforming the vector the system will automatically do the perspective divide, we need to select the values of the third row of the matrix such that following the division any Z value within the viewing range (i.e. NearZ <= Z <= FarZ) will be mapped to the [-1,1] range. Such a mapping operation is composed of two parts. First we scale the range [NearZ, FarZ] down to a range with a width of 2. Then we move (or translate) the range such that it starts at -1. Scaling the Z value and then translating it is represented by the general function f(Z) = A * Z + B.
But following the perspective divide (division by Z) the right hand side of the function becomes A + B / Z.
Now we need to find the values of A and B that perform the mapping to [-1,1]. We know that when Z equals NearZ the result must be -1 and that when Z equals FarZ the result must be 1. Therefore we can write:
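Written out (and consistent with the m[2][2] and m[2][3] entries of the code below):

A + B / NearZ = -1
A + B / FarZ  =  1

Solving these two equations gives:

A = (NearZ + FarZ) / (FarZ - NearZ)
B = 2 * FarZ * NearZ / (NearZ - FarZ)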
Now we need to select the third row of the matrix as a vector (a b c d) whose dot product with the position vector (x, y, z, 1) yields A * Z + B.
We can immediately set 'a' and 'b' to be zero because we don't want X and Y to have any effect on the transformation of Z. Then our A value can become 'c' and the B value can become 'd' (since W is known to be 1).
Therefore, the final transformation matrix is:
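Reassembled from the InitPerspectiveProj() code in the source walkthrough below (row-major, with zRange = NearZ - FarZ), it is:

1/(tan(alpha/2)*ar)   0                0                         0
0                     1/tan(alpha/2)   0                         0
0                     0                (-NearZ - FarZ)/zRange    2*FarZ*NearZ/zRange
0                     0                1                         0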
After multiplying the vertex position by the projection matrix the coordinates are said to be in Clip Space and after performing the perspective divide the coordinates are in NDC Space (Normalized Device Coordinates).
The path that we have taken in this series of tutorials should now become clear. Without doing any projection we can simply output vertices from the VS whose XYZ components (of the position vector) are within the range of [-1,+1]. This will make sure they end up somewhere on the screen. By making sure that W is always 1 we basically prevent perspective divide from having any effect. After that the coordinates are transformed to screen space and we are done. When using the projection matrix the perspective divide step becomes an integral part of the 3D to 2D projection.

Source walkthru

void Pipeline::InitPerspectiveProj(Matrix4f& m) const
{
    const float ar = m_persProj.Width / m_persProj.Height;
    const float zNear = m_persProj.zNear;
    const float zFar = m_persProj.zFar;
    const float zRange = zNear - zFar;
    const float tanHalfFOV = tanf(ToRadian(m_persProj.FOV / 2.0));

    m.m[0][0] = 1.0f / (tanHalfFOV * ar); 
    m.m[0][1] = 0.0f;
    m.m[0][2] = 0.0f;
    m.m[0][3] = 0.0f;

    m.m[1][0] = 0.0f;
    m.m[1][1] = 1.0f / tanHalfFOV; 
    m.m[1][2] = 0.0f; 
    m.m[1][3] = 0.0f;

    m.m[2][0] = 0.0f; 
    m.m[2][1] = 0.0f; 
    m.m[2][2] = (-zNear - zFar) / zRange; 
    m.m[2][3] = 2.0f * zFar * zNear / zRange;

    m.m[3][0] = 0.0f;
    m.m[3][1] = 0.0f; 
    m.m[3][2] = 1.0f; 
    m.m[3][3] = 0.0f;
}
A structure called m_persProj was added to the Pipeline class that holds the perspective projection configurations. The method above generates the matrix that we have developed in the background section.
m_transformation = PersProjTrans * TranslationTrans * RotateTrans * ScaleTrans;
We add the perspective projection matrix as the first element in the multiplication that generates the complete transformation. Remember that since the position vector is multiplied on the right hand side, the leftmost matrix is actually applied last: first we scale, then rotate, then translate and finally project.
p.SetPerspectiveProj(30.0f, WINDOW_WIDTH, WINDOW_HEIGHT, 1.0f, 1000.0f);
In the render function we set the projection parameters. Play with these and see their effect.
For more information on this subject check out the following video tutorial by Frahaan Hussain.




Background

In the last several tutorials we saw two types of transformations. The first type consisted of transformations that change the position (translation), orientation (rotation) or size (scaling) of an object. These transformations allow us to place an object anywhere within the 3D world. The second type was the perspective projection transformation that takes the position of a vertex in the 3D world and projects it onto a 2D plane. Once the coordinates are in 2D it is very easy to map them to screen space coordinates. These coordinates are used to actually rasterize the primitives from which the object is composed (be they points, lines or triangles).
The missing piece of the puzzle is the location of the camera. In all the previous tutorials we implicitly assumed that the camera is conveniently located at the origin of the 3D space. In reality, we want to have the freedom to place the camera anywhere in the world and project the vertices onto some 2D plane in front of it. This will reflect the correct relation between the camera and the object on screen.
In the following picture we see the camera positioned somewhere with its back to us. There is a virtual 2D plane before it and the ball is projected onto the plane. The camera is tilted somewhat so the plane is tilted accordingly. Since the view from the camera is limited by its field of view angle, the visible part of the (endless) 2D plane is a rectangle. Anything outside it is clipped out. Getting that rectangle onto the screen is our target.









Theoretically, it is possible to generate the transformations that would take an object in the 3D world and project it onto a 2D plane lying in front of a camera positioned at an arbitrary location in the world. However, that math is much more complex than what we have previously seen. It is much simpler to do it when the camera is stationed at the origin of the 3D world and looking down the Z axis. For example, suppose an object is positioned at (0,0,5) and the camera is at (0,0,1), looking down the Z axis (i.e. directly at the object). If we move both the camera and the object by one unit towards the origin then the relative distance and orientation (in terms of the direction of the camera) remain the same, only now the camera is positioned at the origin. Moving all the objects in the scene in the same way will allow us to render the scene correctly using the methods that we have already learned.
The example above was simple because the camera was already looking down the Z axe and was in general aligned to the axes of the coordinate system. But what happens if the camera is looking somewhere else? Take a look at the following picture. For simplicity, this is a 2D coordinate system and we are looking at the camera from the top.










The camera was originally looking down the Z axis but then turned 45 degrees clockwise. As you can see, the camera defines its own coordinate system, which may be identical to the world's (upper picture) or different (lower picture). So there are actually two coordinate systems simultaneously. There is the 'world coordinate system' in which the objects are specified and there is a camera coordinate system which is aligned with the "axes" of the camera (target, up and right). These two coordinate systems are known as 'world space' and 'camera/view space'.
The green ball is located at (0,y,z) in world space. In camera space it is located somewhere in the upper left quadrant of the coordinate system (i.e. it has a negative X and a positive Z). We need to find the location of the ball in camera space. Then we can simply forget all about the world space and use only the camera space. In camera space the camera is located at the origin and looking down the Z axis. Objects are specified relative to the camera and can be rendered using the tools we have learned.
Saying that the camera turned 45 degrees clockwise is the same as saying that the green ball turned 45 degrees counter-clockwise. The movement of the objects is always opposite to the movement of the camera. So in general, we need to add two new transformations and plug them into the transformation pipeline that we already have. We need to move the objects in a way that keeps their distance from the camera the same while bringing the camera to the origin, and we need to turn the objects in the direction opposite to the one the camera is turning in.
Moving the camera is very simple. If the camera is located at (x,y,z), then the translation transformation is (-x, -y, -z). The reason is straightforward - the camera was placed in the world using a translation transformation based on the vector (x,y,z), so to move it back to the origin we need a translation transformation based on the opposite of that vector. This is how the transformation matrix looks:
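In homogeneous coordinates this is simply the standard translation matrix built from the negated camera position:

1  0  0  -x
0  1  0  -y
0  0  1  -z
0  0  0   1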









The next step is to turn the camera toward some target specified in world space coordinates. We want to find out the location of the vertices in the new coordinate system that the camera defines. So the actual question is: how do we transform from one coordinate system to another?
Take another look at the picture above. We can say that the world coordinate system is defined by the three linearly independent unit vectors (1,0,0), (0,1,0) and (0,0,1). Linearly independent means that we cannot find x, y and z, not all zero, such that x*(1,0,0) + y*(0,1,0) + z*(0,0,1) = (0,0,0). In more geometrical terms this means that any pair of vectors out of these three defines a plane which is perpendicular to the third vector (the XY plane is perpendicular to the Z axis, etc). It is easy to see that the camera coordinate system is defined by the vectors (1,0,-1), (0,1,0), (1,0,1). After normalizing these vectors we get (0.7071,0,-0.7071), (0,1,0) and (0.7071,0,0.7071).
The following image shows how the location of a vector is specified in two different coordinate systems:









We know how to get the unit vectors that represent the camera axes in world space and we know the location of the vector in world space (x,y,z). What we are looking for is the vector (x',y',z'). We now take advantage of a property of the dot product known as 'scalar projection'. The scalar projection is the result of a dot product between an arbitrary vector A and a unit vector B, and it equals the magnitude of A in the direction of B - in other words, the projection of vector A onto vector B. In the example above, if we do a dot product between (x,y,z) and the unit vector that represents the camera X axis we get x'. In the same manner we can get y' and z'. (x',y',z') is the location of (x,y,z) in camera space.
Let's see how to turn this knowledge into a complete solution for orienting the camera. The solution is called 'UVN camera' and is just one of many systems to specify the orientation of a camera. The idea is that the camera is defined by the following vectors:
  1. N - The vector from the camera to its target. Also known as the 'look at' vector in some 3D literature. This vector corresponds to the Z axis.
  2. V - When standing upright this is the vector from your head to the sky. If you are writing a flight simulator and the plane is flying upside down, that vector may very well point to the ground. This vector corresponds to the Y axis.
  3. U - This vector points from the camera to its "right" side. It corresponds to the X axis.
In order to transform a position in world space to the camera space defined by the UVN vectors we need to perform a dot product operation between the position and the UVN vectors. A matrix represents this best:
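Concretely (as the InitCameraTransform() code below confirms), the rows of that matrix are just the U, V and N vectors:

Ux  Uy  Uz  0
Vx  Vy  Vz  0
Nx  Ny  Nz  0
0   0   0   1

Multiplying a world-space position (as a column vector with W = 1) by this matrix performs exactly the three dot products described above.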
In the code that accompanies this tutorial you will notice that the shader global variable 'gWorld' has been renamed 'gWVP'. This change reflects the way the series of transformations is known in many textbooks. WVP stands for - World-View-Projection.

Source walkthru

In this tutorial I decided to make a small design change and moved the low level matrix manipulation code from the Pipeline class to the Matrix4f class. The Pipeline class now tells Matrix4f to initialize itself in different ways and concatenates several matrices to create the final transformation.
(pipeline.h:85)
struct { 
    Vector3f Pos; 
    Vector3f Target;
    Vector3f Up;
} m_camera;
The Pipeline class has a few new members to store the parameters of the camera. Note that the axis that points from the camera to its "right" (the 'U' axis) is missing. It is calculated on the fly using a cross product between the target and up axes. In addition there is a new function called SetCamera to pass these values.
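SetCamera itself is not listed in this walkthrough; a plausible sketch, assuming it simply stores its arguments in m_camera, is:

void Pipeline::SetCamera(const Vector3f& Pos, const Vector3f& Target, const Vector3f& Up)
{
    // Assumed implementation: just cache the camera parameters for GetTrans().
    m_camera.Pos = Pos;
    m_camera.Target = Target;
    m_camera.Up = Up;
}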
(math3d.h:21)
Vector3f Vector3f::Cross(const Vector3f& v) const 
{
    const float _x = y * v.z - z * v.y;
    const float _y = z * v.x - x * v.z;
    const float _z = x * v.y - y * v.x;

    return Vector3f(_x, _y, _z);
}
The Vector3f class has a new method to calculate the cross product of two Vector3f objects. The cross product of two vectors is a vector which is perpendicular to the plane defined by those vectors. This becomes more intuitive when you remember that vectors have a direction and magnitude but no position. All vectors with the same direction and magnitude are considered equal, regardless of where they "start". So you might as well make both vectors start at the origin. This means that you can create a triangle that has one vertex at the origin and two vertices at the tips of the vectors. The triangle defines a plane and the cross product is a vector which is perpendicular to that plane. Read more on the cross product in Wikipedia.
(math3d.h:30)
Vector3f& Vector3f::Normalize()
{
    const float Length = sqrtf(x * x + y * y + z * z);

    x /= Length;
    y /= Length;
    z /= Length;

    return *this;
}
To generate the UVN matrix we will need to make the vectors unit length. This operation is formally known as 'vector normalization' and is executed by dividing each vector component by the vector's length. More on this in Mathworld.
(math3d.cpp:84)
void Matrix4f::InitCameraTransform(const Vector3f& Target, const Vector3f& Up)
{
    Vector3f N = Target;
    N.Normalize();
    Vector3f U = Up;
    U = U.Cross(Target);
    U.Normalize();
    Vector3f V = N.Cross(U);

    m[0][0] = U.x; m[0][1] = U.y; m[0][2] = U.z; m[0][3] = 0.0f;
    m[1][0] = V.x; m[1][1] = V.y; m[1][2] = V.z; m[1][3] = 0.0f;
    m[2][0] = N.x; m[2][1] = N.y; m[2][2] = N.z; m[2][3] = 0.0f;
    m[3][0] = 0.0f; m[3][1] = 0.0f; m[3][2] = 0.0f; m[3][3] = 1.0f;
}
This function generates the camera transformation matrix that will be used later by the Pipeline class. The U, V and N vectors are calculated and set into the rows of the matrix. Since the vertex position is going to be multiplied on the right side (as a column vector), this amounts to a dot product between each of U, V and N and the position. This yields the 3 scalar projection values that become the XYZ components of the position in camera space.
The function is supplied with the target and up vectors. The "right" vector is calculated as the cross product between them. Note that we do not trust the caller to pass unit length vectors so we normalize the vectors anyway. After generating the U vector we recalculate the up vector as a cross product between the target and the right vector. The reason will become clearer later when we start moving the camera. It is simpler to update only the target vector and leave the up vector untouched. However, this means that the angle between the target and the up vectors would no longer be 90 degrees, which would make this an invalid (non-orthogonal) coordinate system. By calculating the right vector as a cross product of the target and the up vectors, and then recalculating the up vector as a cross product between the target and the right, we get a coordinate system with 90 degrees between each pair of axes.
(pipeline.cpp:22)
const Matrix4f* Pipeline::GetTrans()
{
    Matrix4f ScaleTrans, RotateTrans, TranslationTrans, CameraTranslationTrans, CameraRotateTrans, PersProjTrans;

    ScaleTrans.InitScaleTransform(m_scale.x, m_scale.y, m_scale.z);
    RotateTrans.InitRotateTransform(m_rotateInfo.x, m_rotateInfo.y, m_rotateInfo.z);
    TranslationTrans.InitTranslationTransform(m_worldPos.x, m_worldPos.y, m_worldPos.z);
    CameraTranslationTrans.InitTranslationTransform(-m_camera.Pos.x, -m_camera.Pos.y, -m_camera.Pos.z);
    CameraRotateTrans.InitCameraTransform(m_camera.Target, m_camera.Up);
    PersProjTrans.InitPersProjTransform(m_persProj.FOV, m_persProj.Width, m_persProj.Height, m_persProj.zNear, m_persProj.zFar);

    m_transformation = PersProjTrans * CameraRotateTrans * CameraTranslationTrans * TranslationTrans * RotateTrans * ScaleTrans;
    return &m_transformation;
}
Let's update the function that generates the complete transformation matrix of an object. It is now becoming quite complex with two new matrices that provide the camera part. After completing the world transformation (the combined scaling, rotation and translation of the object) we start the camera transformation by "moving" the camera to the origin. This is done by a translation using the negative vector of the camera position. So if the camera is positioned at (1,2,3) we need to move the object by (-1,-2,-3) in order to get the camera back to the origin. After that we generate the camera rotation matrix based on the camera target and up vectors. This completes the camera part. Finally, we project the coordinates.
(tutorial13.cpp:76)
Vector3f CameraPos(1.0f, 1.0f, -3.0f);
Vector3f CameraTarget(0.45f, 0.0f, 1.0f);
Vector3f CameraUp(0.0f, 1.0f, 0.0f);
p.SetCamera(CameraPos, CameraTarget, CameraUp);
We use the new capability in the main render loop. To place the camera we step back from the origin along the negative Z axis, then move to the right and straight up. The camera is looking along the positive Z axis and a bit to the right of the origin. The up vector is simply the positive Y axis. We set all this into the Pipeline object and the Pipeline class takes care of the rest.
For more information on this subject check out the following video tutorial by Frahaan Hussain.


 Continues through Tutorial 20


