Toutiao interviewer: you enter a URL in the browser address bar and press Enter. What are the technical steps behind it?

Programmer a mu 2021-08-06 11:22:09

This is a classic question in Toutiao's campus recruitment interviews. If you read through collections of interview write-ups, you will find that it comes up at least once in every 3 interviews. It is also a very basic piece of knowledge.

Okay, let's get straight to the point.

Overall, the process has 6 steps:

  1. DNS resolution

  2. TCP connection

  3. Sending the HTTP request

  4. The server processes the request

  5. The browser parses and renders the page

  6. Closing the connection

DNS resolution

What is DNS?

DNS is a naming system for computers and network services, organized into a hierarchy of domains. It is used on TCP/IP networks, and the service it provides converts host names and domain names into IP addresses. DNS plays the role of the "translator" here.

The root name servers are the highest-level name servers in the Internet's Domain Name System (DNS). They are responsible for returning the addresses of the authoritative name servers for the top-level domains. They are an important part of the Internet infrastructure, because all domain name resolution ultimately depends on them. Because of a limitation shared by DNS and certain protocols (an unfragmented User Datagram Protocol (UDP) packet carrying a DNS message is limited to 512 bytes over IPv4), the number of root server addresses is limited to 13.

A top-level domain (TLD) is the highest level of domain name. Put simply, it is the last part of a site's domain name. For example, the top-level domain of www.baidu.com is .com. TLDs fall into two categories. One is generic top-level domains (gTLD), such as .com, .net, .edu, .org and so on, more than 700 in total. The other is country code top-level domains (ccTLD), which represent countries and regions, such as .cn (China) and .io (British Indian Ocean Territory), more than 300 in total.

Name server (Name Server): on the Internet, this is a program or server that implements the domain name service protocol. It maps human-readable identifiers to the identifiers used inside the system, which are usually numeric. DNS servers are the best-known name servers: domain names are one of the two main namespaces on the Internet.

DNS resolution process

  1. Check whether the browser cache already holds the IP address for the domain name

  2. If the IP is not found in the browser cache, check whether the local system has cached it

  3. Send a domain name resolution request to the local DNS server

  4. The local DNS server sends a resolution request to the root name server

  5. The root name server returns the address of the gTLD (generic top-level domain) name server

  6. The local DNS server sends a resolution request to the gTLD server

  7. The gTLD server receives the request and returns the address of the Name Server for the domain

  8. The Name Server returns the IP address to the local DNS server

  9. The local DNS server caches the resolution result

  10. The resolution result is returned to the user
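The ten steps above can be sketched as a small in-memory simulation. This is only an illustration, not how a real resolver is implemented: the zone tables, the server names (gtld-com, ns1), the host www.example.com and the address 192.0.2.10 are all made up, and no network traffic is involved.

```javascript
// All names and addresses below are hypothetical, for illustration only.
const rootZone = { com: "gtld-com" };                             // TLD -> gTLD server
const gtldZones = { "gtld-com": { "example.com": "ns1" } };       // domain -> name server
const nameServers = { ns1: { "www.example.com": "192.0.2.10" } }; // host -> IP

const browserCache = new Map();
const systemCache = new Map();
const localDnsCache = new Map();

function resolve(hostname) {
  const trace = [];
  // Steps 1-2: browser cache first, then the local system cache.
  if (browserCache.has(hostname)) {
    trace.push("browser cache");
    return { ip: browserCache.get(hostname), trace };
  }
  if (systemCache.has(hostname)) {
    trace.push("system cache");
    const cached = systemCache.get(hostname);
    browserCache.set(hostname, cached);
    return { ip: cached, trace };
  }
  // Step 3: ask the local DNS server, which may already hold the answer.
  trace.push("local DNS server");
  let ip = localDnsCache.get(hostname);
  if (ip === undefined) {
    const labels = hostname.split(".");
    const tld = labels[labels.length - 1];
    const domain = labels.slice(-2).join(".");
    // Steps 4-5: the root server points to the gTLD server.
    trace.push("root server");
    const gtld = rootZone[tld];
    // Steps 6-7: the gTLD server points to the domain's name server.
    trace.push("gTLD server");
    const ns = gtld && gtldZones[gtld][domain];
    // Step 8: the name server returns the IP address.
    trace.push("name server");
    ip = ns ? nameServers[ns][hostname] : undefined;
    if (ip === undefined) return { ip: null, trace };
    // Step 9: the local DNS server caches the result.
    localDnsCache.set(hostname, ip);
  }
  // Step 10: return the result; the client-side caches remember it.
  systemCache.set(hostname, ip);
  browserCache.set(hostname, ip);
  return { ip, trace };
}
```

Calling resolve("www.example.com") twice shows the point of the caches: the first call walks the whole hierarchy, while the second is answered from the browser cache alone.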

DNS load balancing

DNS load balancing works by configuring multiple IP addresses for the same host name on the DNS server. When answering DNS queries, the DNS server returns the IP addresses recorded for that host in the DNS file in rotating order, so that different clients are directed to different machines. Different clients therefore reach different servers, which achieves load balancing.
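A minimal sketch of that rotation, assuming a hypothetical zone table with three made-up addresses for one host name. Real DNS servers have their own rotation policies; this only shows the idea of answering successive queries in a different order.

```javascript
// Hypothetical zone data: one host name mapped to several addresses.
const zone = { "www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"] };
const queryCount = new Map();

// Answer a query with the host's addresses, rotated by one on every query.
function answerQuery(hostname) {
  const ips = zone[hostname];
  if (!ips) return [];
  const n = queryCount.get(hostname) || 0;
  queryCount.set(hostname, n + 1);
  const k = n % ips.length;
  return ips.slice(k).concat(ips.slice(0, k));
}
```

A client that always connects to the first address in the answer ends up on a different machine than the client that asked just before it.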

TCP connection

The purpose of the three-way handshake

The purpose is to prevent a stale connection request segment from suddenly reaching the server and causing an error.

The three-way handshake:

  1. The client sends a segment with SYN=1, Seq=X to the server (the first handshake, initiated by the browser, telling the server "I am about to send a request")

  2. The server replies with a segment carrying SYN=1, ACK=1, Seq=Y, Ack=X+1 to confirm (the second handshake, initiated by the server, telling the browser "I am ready to receive, go ahead and send")

  3. The client replies with a segment carrying ACK=1, Ack=Y+1, Seq=Z, where Z=X+1, and the handshake is complete (the third handshake, sent by the browser, telling the server "I am about to send, get ready to receive")

Segment format:

  1. Sequence number: Seq (Sequence number), 32 bits. It identifies the byte stream sent from the TCP source to the destination; the sender marks its data with it.

  2. Acknowledgment number: Ack (Acknowledgment number), 32 bits. The acknowledgment number field is valid only when the ACK flag is 1, and Ack=Seq+1.

  3. Flags: 6 in total, namely URG, ACK, PSH, RST, SYN and FIN, with the following meanings:

  • URG: urgent. The urgent pointer is valid.

  • ACK: acknowledgement. The acknowledgment number is valid.

  • PSH: push. The receiver should deliver this segment to the application layer as soon as possible.

  • RST: reset. Reset the connection.

  • SYN: synchronize. Initiate a new connection.

  • FIN: finish. Release a connection.
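As a small illustration, the six flags can be decoded from the flag bits of a TCP header. The bit values below are the standard positions of these flags; the function name decodeFlags is made up for this sketch.

```javascript
// Bit values of the six flags within the TCP header's flag field,
// listed in header order (URG is the highest of the six bits).
const TCP_FLAGS = [
  ["URG", 0x20], ["ACK", 0x10], ["PSH", 0x08],
  ["RST", 0x04], ["SYN", 0x02], ["FIN", 0x01],
];

// Return the names of all flags that are set in `bits`.
function decodeFlags(bits) {
  return TCP_FLAGS
    .filter(([, mask]) => (bits & mask) !== 0)
    .map(([name]) => name);
}
```

For example, the second segment of the handshake has both SYN and ACK set, so its flag bits are 0x12 and decodeFlags(0x12) yields ACK and SYN.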

Note:

  1. Do not confuse the acknowledgment number Ack with the ACK flag.

  2. The acknowledging side's Ack = the initiating side's Seq + 1; the two ends pair up.
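The pairing rule can be checked with a short simulation of the three segments. In this sketch the uppercase SYN and ACK fields are the flag bits, the lowercase seq and ack fields are the sequence and acknowledgment numbers, and x and y stand for whatever initial sequence numbers the two sides happen to pick. The function name is made up.

```javascript
// Sequence numbers are 32-bit values, so "+ 1" wraps around at 2^32.
const inc = (n) => (n + 1) >>> 0;

// x and y are the initial sequence numbers of client and server.
function threeWayHandshake(x, y) {
  const syn = { from: "client", SYN: 1, ACK: 0, seq: x };
  const synAck = { from: "server", SYN: 1, ACK: 1, seq: y, ack: inc(syn.seq) };
  const ack = { from: "client", SYN: 0, ACK: 1, seq: synAck.ack, ack: inc(synAck.seq) };
  return [syn, synAck, ack];
}
```

With x = 100 and y = 300, the server acknowledges 101 and the client acknowledges 301, which is the Ack = Seq + 1 pairing described above.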

(°ー°〃) Here is the three-way handshake in plain language:

Courier: Hello, your package has arrived. Are you at home? Xiao Ming: I'm home, bring it over. Courier: OK, I'll deliver it right away.


Sending the HTTP request

The request message consists of four parts: the request line, the request headers, a blank line, and the request body.

The request line contains the request method, the URL, and the protocol version.

  • Request methods include GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS, and TRACE.

  • The URL is the address being requested.

  • The protocol version is the HTTP version number.

GET /js/count.js HTTP/1.1

In the line above, GET is the request method, /js/count.js is the URL, and HTTP/1.1 is the HTTP version.

The request headers carry additional information about the request as key/value pairs, one pair per line, with the key and the value separated by an ASCII colon ":".

The request headers tell the server about the client's request. They contain a lot of useful information about the client environment and the request body. For example:

  • Host: the host name, used for virtual hosting

  • Connection: added in HTTP/1.1. With keep-alive, that is, persistent connections, one connection can carry multiple requests

  • User-Agent: information about the client program, that is, the browser that sent the request

  • Accept: the media types the browser can accept

  • Accept-Encoding: tells the server which content encodings the browser supports and their order of preference; several encodings can be listed at once

  • Accept-Language: tells the server which natural languages the browser can handle

  • Cookie: user-related information recorded by the browser

Request body: it can carry multiple request parameters. This part of the message includes the carriage return, the line feed and the request data. Not every request has request data.
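To make the four parts concrete, here is a rough sketch that splits a raw request message into request line, headers and body. The function name parseRequest and the host www.example.com are assumptions for this example, and real HTTP parsers handle many cases this one ignores (repeated headers, malformed lines and so on).

```javascript
function parseRequest(raw) {
  // The blank line (CRLF CRLF) separates the head from the body.
  const sep = raw.indexOf("\r\n\r\n");
  const head = sep === -1 ? raw : raw.slice(0, sep);
  const body = sep === -1 ? "" : raw.slice(sep + 4);

  // First line is the request line: method, URL, protocol version.
  const [requestLine, ...headerLines] = head.split("\r\n");
  const [method, url, version] = requestLine.split(" ");

  // Each header line is "key: value"; header names are case-insensitive.
  const headers = {};
  for (const line of headerLines) {
    const i = line.indexOf(":");
    if (i === -1) continue; // skip lines that are not key/value pairs
    headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
  }
  return { method, url, version, headers, body };
}

const rawRequest =
  "GET /js/count.js HTTP/1.1\r\n" +
  "Host: www.example.com\r\n" +
  "Connection: keep-alive\r\n" +
  "\r\n";
const parsed = parseRequest(rawRequest);
```

Here parsed.method is GET, parsed.url is /js/count.js, and parsed.body is empty because a GET request like this one carries no request data.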

The server processes the request and returns an HTTP message

The response message consists of three parts: the response line, the response headers, and the response body.

The response line contains the protocol version, the status code, and the status description.

HTTP/1.1 200 OK

  • Protocol version: HTTP/1.1

  • Status code: 200. Common status codes are:

    • 200: The request succeeded

    • 201: Created. The request succeeded and a new resource was created

    • 203: Non-authoritative information. The request succeeded, but the returned meta information does not come from the origin server; it is a copy

    • 204: No content. The server processed the request successfully but returned no content. The browser keeps displaying the current document without updating the page

    • 301: Permanent redirect

    • 302: Temporary redirect

    • 307: Temporary redirect. Similar to 302, but the request method must be kept unchanged when following the redirect

    • 400: Syntax error in the client request; the server cannot understand it (for example, the parameters passed to the server differ from the fields the server accepts)

    • 404: The server could not find the resource the client requested

    • 405: The method in the client request is not allowed (the request method is wrong, for example the server expects a GET request and the client uses POST)

    • 500: Internal server error

  • Status description: OK
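A small sketch that splits a response line into its three fields and names the class a status code belongs to, using its first digit. The function name parseStatusLine is made up; the five classes (1xx to 5xx) are the standard grouping of HTTP status codes.

```javascript
// Status code classes, keyed by the first digit of the code.
const STATUS_CLASSES = {
  1: "informational",
  2: "success",
  3: "redirection",
  4: "client error",
  5: "server error",
};

function parseStatusLine(line) {
  // version, three-digit status code, optional status description
  const m = /^(HTTP\/\d+(?:\.\d+)?) (\d{3}) ?(.*)$/.exec(line);
  if (!m) return null;
  const code = Number(m[2]);
  return {
    version: m[1],
    code,
    reason: m[3],
    category: STATUS_CLASSES[Math.floor(code / 100)] || "unknown",
  };
}
```

For the line HTTP/1.1 200 OK this gives version HTTP/1.1, code 200, reason OK and category success; a 404 falls under client error and a 500 under server error.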

Response headers

The response headers give the client additional information so that it can handle the response better.

  • Server: tells the client about the HTTP server software currently running on the server. It may contain the name and version number of the server software

  • Content-Type: the type of the entity content the server returns to the browser

  • Transfer-Encoding: chunked indicates that the length of the output cannot be determined in advance. Ordinary static pages, images and the like generally do not use it; dynamic pages may.

  • Cache-Control: cache control. For example, private means the content may only be stored in a private cache (the client may cache it, proxy servers may not)

  • Expires: tells the client when the resource expires

Response body

The text information the server returns to the client

The browser parses and renders the page

Critical rendering path

The critical rendering path is the whole process that starts when the browser receives the requested HTML, CSS, JS and other resources, continues through parsing, building the trees, layout and painting, and ends with the interface the user can see.

It mainly includes the following steps:

  1. Parse the HTML to generate the DOM tree

  2. Parse the CSS to generate the CSSOM rule tree

  3. Combine the DOM tree and the CSSOM rule tree to generate the render tree

  4. Traverse the render tree to start layout, computing the position and size of each node

  5. Paint each node of the render tree to the screen

Building the DOM tree

After the browser receives the HTML document in the server's response, it traverses the document nodes and generates the DOM tree. Note that DOM tree generation may be blocked by the loading and execution of CSS and JS.

Building the CSSOM rule tree

The browser parses the CSS files and generates the CSS rule tree. Each CSS file is parsed into a StyleSheet object, and each object contains CSS rules. A CSS rule object contains the selector and declaration objects that correspond to the CSS syntax, along with other objects.

Render blocking

When the browser encounters a script tag, DOM construction pauses until the script has finished executing, and then DOM construction continues. Every JS script that runs can seriously block construction of the DOM tree. If the script also operates on the CSSOM, and that CSSOM has not yet been downloaded and built, the browser will even delay both script execution and DOM construction until the CSSOM has been downloaded and built.

So the position of the script tag matters a lot. In practice, two principles can be followed:

  1. CSS first: in the order of inclusion, CSS resources come before JS resources.

  2. JS last: we usually put JS code at the bottom of the page, and JS should affect DOM construction as little as possible.

Building the render tree

With the DOM tree and the CSS rule tree we can build the render tree. The browser traverses every visible node, starting from the root of the DOM tree. For each visible node, it finds the matching CSS style rules and applies them.

After the render tree is built, each node in it is visible and carries its content together with the styles of the matching rules. This is also the biggest difference between the render tree and the DOM tree. The render tree is used for display, so invisible elements do not appear in it. In particular, elements with display set to none do not appear in this tree, but elements with visibility set to hidden do.
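The difference between display: none and visibility: hidden can be sketched with plain objects. The node shape ({ tag, style, children }) is invented for this example, and the style field stands in for the result of matching CSSOM rules against the node; a real browser's data structures are far richer.

```javascript
// Hypothetical DOM nodes: { tag, style, children }.
function buildRenderTree(node) {
  const style = node.style || {};
  // display: none removes the node and its whole subtree from the render tree.
  if (style.display === "none") return null;
  const children = (node.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  // visibility: hidden nodes stay in the tree; they are simply not drawn.
  return { tag: node.tag, style, children };
}

const dom = {
  tag: "body",
  children: [
    { tag: "h1" },
    { tag: "div", style: { display: "none" }, children: [{ tag: "span" }] },
    { tag: "p", style: { visibility: "hidden" } },
  ],
};
const renderTree = buildRenderTree(dom);
```

In the resulting tree, body has two children, h1 and p: the hidden paragraph is kept, while the div with display: none is dropped together with the span inside it.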

Render tree layout

The layout phase starts from the root node of the render tree and determines the exact size and position of each node object on the page. The output of the layout phase is a box model that precisely captures the exact position and size of each element on the screen.

Painting the render tree

In the painting phase, the browser traverses the render tree and calls each renderer's paint() method to display its content on the screen. Painting the render tree is done by the browser's UI backend component.

Reflow and repaint

Based on the render tree, layout computes the CSS styles, that is, the geometric information such as the size and position of each node on the page. HTML uses flow layout by default; CSS and JS can break this layout and change the appearance, style, size and position of DOM elements. This triggers reflow and repaint.

Repaint

Part of the screen is repainted without affecting the overall layout. For example, the background color set by some CSS changes, but the geometry and position of the element stay the same.

Common properties that cause repaint

  • color

  • border-style

  • box-shadow

  • background

  • background-size

  • border-radius

  • background-position

Reflow

When the size or position of an element changes, the render tree has to be re-validated and recomputed. Part or all of the render tree changes.

Common properties and operations that cause reflow

  • Adding or removing visible DOM elements

  • Element size changes: margin, padding, border, width and height

  • Content changes, for example the user typing text into an input

  • Browser window size changes

  • Computing offsetWidth and offsetHeight

As can be seen from the above: reflow always causes repaint, but repaint does not necessarily cause reflow.
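The two lists can be turned into a small lookup to make the "reflow always implies repaint" relationship explicit. The property sets below simply copy the lists in this article (border-width is my reading of the "border" size change); real browser engines consider many more properties, so treat this as a toy classification.

```javascript
// Property lists taken from this article; real engines know many more.
const REPAINT_ONLY = new Set([
  "color", "border-style", "box-shadow", "background",
  "background-size", "border-radius", "background-position",
]);
const CAUSES_REFLOW = new Set([
  "width", "height", "margin", "padding", "border-width",
]);

// Return the rendering work a change to `property` triggers in this model.
function effectOf(property) {
  if (CAUSES_REFLOW.has(property)) return ["reflow", "repaint"];
  if (REPAINT_ONLY.has(property)) return ["repaint"];
  return []; // not covered by the lists above
}
```

Changing width reports both reflow and repaint, while changing color reports repaint only.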

The browser's render queue

Think about it: how many renders does the following code trigger?

div.style.left = '10px';
div.style.top = '10px';
div.style.width = '20px';
div.style.height = '20px';

In theory this code triggers 4 reflows and repaints, because each line changes a geometric property of the element. In practice only one reflow is triggered, thanks to the browser's render queue mechanism.

When the browser sees that a line of code changes an element's style, it does not render immediately. It waits a moment to see whether the next line also changes a style, and if it does, it waits again. If several consecutive lines change styles, the browser waits until those lines have finished and only then renders. This is the browser's render queue mechanism.
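The queueing idea can be modelled with a toy class. This is not a browser API: RenderQueue, write, read and flush are invented names, and the model only counts how many reflows would happen. It also shows why reading a layout value such as offsetWidth between writes is costly: the browser has to flush the queue first so the value is up to date.

```javascript
// A toy model of the render queue, not a real browser API.
class RenderQueue {
  constructor() {
    this.pending = [];
    this.styles = {};
    this.reflowCount = 0;
  }
  // A style write is only queued; nothing is rendered yet.
  write(property, value) {
    this.pending.push([property, value]);
  }
  // Apply all queued writes in one pass: one reflow for the whole batch.
  flush() {
    if (this.pending.length === 0) return;
    for (const [property, value] of this.pending) this.styles[property] = value;
    this.pending = [];
    this.reflowCount += 1;
  }
  // Reading a layout value forces the queue to flush first.
  read(property) {
    this.flush();
    return this.styles[property];
  }
}

const batched = new RenderQueue();
batched.write("left", "10px");
batched.write("top", "10px");
batched.write("width", "20px");
batched.write("height", "20px");
batched.flush(); // end of the script: four writes, one reflow

const interleaved = new RenderQueue();
interleaved.write("width", "20px");
interleaved.read("width"); // forces a flush
interleaved.write("height", "20px");
interleaved.read("height"); // forces another flush
```

The batched queue ends with a reflow count of 1, while the interleaved one ends with 2, which is why it is usually better to group style writes and do layout reads separately.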

Apply animations to elements whose position property is absolute or fixed (out of the document flow)

This approach still causes reflow, but it does not affect other elements, so it can improve performance.

CSS3 hardware acceleration (GPU acceleration)

Hardware acceleration can avoid reflow and repaint. The following CSS properties can trigger hardware acceleration:

  1. transform

  2. opacity

  3. filter

  4. will-change

If some elements do not need the properties above but you still want the hardware acceleration effect, you can use a few tricks to induce the browser to turn it on.

-webkit-transform: translateZ(0);
-moz-transform: translateZ(0);
-ms-transform: translateZ(0);
-o-transform: translateZ(0);
transform: translateZ(0);
/* or */
transform: rotateZ(360deg);
transform: translate3d(0, 0, 0);

Things to note

  • Too much hardware acceleration may consume more memory.

  • GPU rendering affects the anti-aliasing of fonts. This is because the GPU and the CPU use different rendering mechanisms. Even if hardware acceleration eventually stops, the text will still look blurry during the animation.

Closing the connection

To cut down the time spent on requests, pages today enable persistent connections (keep-alive) by default, so a TCP connection is typically closed when the tab is closed. The closing process is the four-way wave. Because a TCP connection is full duplex, each direction must be closed separately. The principle is that when one side has finished sending its data, it sends a FIN to terminate the connection in that direction. Receiving a FIN only means that no more data will flow in that direction; data can still be sent on the TCP connection in the other direction until a FIN is sent in that direction as well. The side that closes first performs an active close, and the other side performs a passive close.

  1. The client sends a FIN to close the data transfer from client to server, and the client enters the FIN_WAIT_1 state

  2. After receiving the FIN, the server sends an ACK to the client whose acknowledgment number is the received sequence number + 1 (as with SYN, a FIN occupies one sequence number), and the server enters the CLOSE_WAIT state

  3. The server sends a FIN to close the data transfer from server to client, and the server enters the LAST_ACK state

  4. After receiving the FIN, the client enters the TIME_WAIT state and then sends an ACK to the server whose acknowledgment number is the received sequence number + 1. The server enters the CLOSED state, and the four-way wave is complete
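The four steps can be replayed in a short simulation that records each segment together with the state of both ends after it has been delivered. The function name and the record layout are made up, and u and v stand for whatever sequence numbers the client and the server have reached. The move to FIN_WAIT_2 after the first ACK is not spelled out in the list above but follows from the state descriptions of FIN_WAIT_1 and FIN_WAIT_2.

```javascript
// Sequence numbers are 32-bit values, so "+ 1" wraps around at 2^32.
const bump = (n) => (n + 1) >>> 0;

// u and v are the current sequence numbers of the client and the server.
function fourWayClose(u, v) {
  const state = { client: "ESTABLISHED", server: "ESTABLISHED" };
  const log = [];
  // Store the segment plus a snapshot of both states.
  const record = (segment) => log.push({ ...segment, ...state });

  // 1. The client sends FIN and enters FIN_WAIT_1.
  state.client = "FIN_WAIT_1";
  record({ from: "client", flags: "FIN", seq: u });

  // 2. The server acknowledges and enters CLOSE_WAIT; on receiving
  //    that ACK the client moves on to FIN_WAIT_2.
  state.server = "CLOSE_WAIT";
  state.client = "FIN_WAIT_2";
  record({ from: "server", flags: "ACK", ack: bump(u) });

  // 3. The server sends its own FIN and enters LAST_ACK.
  state.server = "LAST_ACK";
  record({ from: "server", flags: "FIN", seq: v });

  // 4. The client enters TIME_WAIT and acknowledges; the server closes.
  state.client = "TIME_WAIT";
  state.server = "CLOSED";
  record({ from: "client", flags: "ACK", ack: bump(v) });

  return log;
}
```

With u = 1000 and v = 5000, the two acknowledgment numbers are 1001 and 5001, and the last record shows the client in TIME_WAIT and the server in CLOSED.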

State details:

**CLOSED:** The initial state.

**LISTEN:** A socket on the server side is listening and can accept connections.

**SYN_RCVD:** This state means a SYN segment has been received. Normally it is an intermediate state of the server-side socket during the three-way handshake that establishes a TCP connection. It is very short, and you will hardly ever see it with netstat unless you write a client test program that deliberately withholds the last ACK of the three-way handshake. In this state, once the ACK from the client is received, the socket enters the ESTABLISHED state.

**SYN_SENT:** This state mirrors SYN_RCVD. When the client socket performs CONNECT, it first sends a SYN segment, then enters the SYN_SENT state and waits for the server to send the second segment of the three-way handshake. SYN_SENT means the client has sent its SYN segment.

**ESTABLISHED:** The connection has been established.

**FIN_WAIT_1:** This state needs some explanation. Both FIN_WAIT_1 and FIN_WAIT_2 really mean waiting for the other side's FIN segment. The difference is this: FIN_WAIT_1 is the state a socket enters when, in the ESTABLISHED state, it wants to actively close the connection and sends a FIN segment to the other side. When the other side replies with an ACK, the socket moves to FIN_WAIT_2. In normal circumstances the other side should reply with an ACK immediately, whatever its situation, so FIN_WAIT_1 is usually hard to observe, while FIN_WAIT_2 can sometimes be seen with netstat.

**FIN_WAIT_2:** This state was covered above. A socket in FIN_WAIT_2 is in a half-closed connection: one side has asked to close the connection, while the other side is saying that it still has some data to send and will close the connection a little later.

**TIME_WAIT:** This means the socket has received the other side's FIN segment and sent the ACK. It only needs to wait 2MSL (MSL: Maximum Segment Lifetime) before returning to the usable CLOSED state. If, in the FIN_WAIT_1 state, a segment carrying both the FIN and the ACK flag arrives, the socket can go straight to TIME_WAIT without passing through FIN_WAIT_2.

**CLOSING:** This state is rather special and should be rare in practice. Normally, after you send a FIN segment, you should first receive (or receive at the same time) the other side's ACK, and only then the other side's FIN. CLOSING means that after sending your FIN you did not receive the other side's ACK but did receive the other side's FIN. When does this happen? If both sides close a socket at almost the same time, both send a FIN simultaneously, and the CLOSING state appears, which means both sides are closing the socket connection.

**CLOSE_WAIT:** This state means waiting to close. How should that be understood? When the other side closes a socket and sends you a FIN segment, your system replies with an ACK and then enters the CLOSE_WAIT state. What you need to consider next is whether you still have data to send to the other side. If not, you can close the socket and send a FIN segment to the other side, that is, close the connection. So in the CLOSE_WAIT state, what remains to be done is for you to close the connection.

**LAST_ACK:** The passively closing side, after sending its FIN segment, waits for the other side's final ACK. Once the ACK is received, the socket enters the usable CLOSED state.
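The transitions described in these state notes fit into a small lookup table. The event names (connect, recvSyn, recvFinAck and so on) and the run helper are made up for this sketch, and the table only covers the transitions mentioned in this article rather than the full TCP state machine.

```javascript
// state -> event -> next state, following the descriptions above.
const TRANSITIONS = {
  CLOSED:      { listen: "LISTEN", connect: "SYN_SENT" },
  LISTEN:      { recvSyn: "SYN_RCVD" },
  SYN_SENT:    { recvSynAck: "ESTABLISHED" },
  SYN_RCVD:    { recvAck: "ESTABLISHED" },
  ESTABLISHED: { close: "FIN_WAIT_1", recvFin: "CLOSE_WAIT" },
  FIN_WAIT_1:  { recvAck: "FIN_WAIT_2", recvFin: "CLOSING", recvFinAck: "TIME_WAIT" },
  FIN_WAIT_2:  { recvFin: "TIME_WAIT" },
  CLOSING:     { recvAck: "TIME_WAIT" },
  TIME_WAIT:   { timeout2MSL: "CLOSED" },
  CLOSE_WAIT:  { close: "LAST_ACK" },
  LAST_ACK:    { recvAck: "CLOSED" },
};

// Feed a list of events through the table, starting from `start`.
function run(start, events) {
  return events.reduce((state, event) => {
    const next = TRANSITIONS[state][event];
    if (!next) throw new Error(`event ${event} is not valid in state ${state}`);
    return next;
  }, start);
}
```

For example, run("ESTABLISHED", ["close", "recvAck", "recvFin", "timeout2MSL"]) follows the active close back to CLOSED, while ["close", "recvFin", "recvAck"] takes the simultaneous-close path through CLOSING and ends in TIME_WAIT.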

 


This article is from the WeChat official account Programmer Yifan (gh_6cafb826630a).
If there is any infringement, please get in touch to have it deleted.

Copyright notice: this article was created by [Programmer a mu]. Please include the original link when reposting. Thanks. https://car.inotgo.com/2021/08/20210806112006842c.html