Notice
Warning
Please follow the writing rules when opening a Pull Request. PRs that do not comply may be rejected.
- Please submit feedback as a Pull Request; see the guide on how to request feedback through a Pull Request.
- The Pull Request writing rules can be found here.
- In addition to GitHub, you can also view it on the GitBook site.
However, because the LaTeX syntax is different there, we are planning to migrate to another website. The migration will take place when the time comes.
- If you have any questions or tips to share, please use the Discussion.
- Community participation is always welcome!
- Please check here for the progress of the interview repo improvement project.
- As mentioned in the notice, progress may be slow.
Interview Questions
Statistics/Math
- Please explain what eigenvalues and eigenvectors are and why they are important.
- Please tell me what sampling and resampling are and the advantages of resampling.
- What are probability models and random variables?
- What are cumulative distribution functions and probability density functions? Please express it with a formula.
- What is conditional probability?
- What are covariance and correlation coefficient? Please express it with a formula.
- What is the definition of a confidence interval?
- How would you explain p-value to someone who doesn't know it?
- What does R-squared mean?
- In which case should I use the mean or median?
- Why is the central limit theorem useful?
- Please explain entropy, and information gain as well if possible.
- When can I use a parametric methodology, and when can I use a non-parametric methodology?
- What is the difference between “likelihood” and “probability”?
- What does bootstrap mean in statistics?
- In cases where there are very few parameters (a few dozen or less), how can a prediction model be established?
- Can you explain the difference between Bayesians and frequentists?
- What is statistical power?
- If there are missing values, should I fill them in? Why?
- What are the criteria for judging outliers?
- How do I calculate the sample size needed?
- How do you control bias?
- When are logarithmic functions useful? Please explain with an example.
- Please explain the Bernoulli distribution / binomial distribution / categorical distribution / multinomial distribution / Gaussian (normal) distribution / t distribution / chi-square distribution / F distribution / beta distribution / gamma distribution. Also, please explain the relationships between these distributions.
- You are about to board a plane for a business trip. You want to know whether you should take an umbrella, so you call three random friends who live at your destination and ask each of them independently whether it is raining. Each friend tells the truth with probability 2/3 and lies with probability 1/3. All three friends say, "Yes, it's raining." What is the probability that it is actually raining?
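As a hedged sketch of one way to approach the umbrella question above: the question does not state a prior probability of rain, so the answer depends on an assumed prior. The function name `posterior_rain` and its parameters are illustrative only. With Bayes' rule and independent friends, three "yes" answers have likelihood (2/3)^3 if it is raining and (1/3)^3 if it is not.

```python
from fractions import Fraction


def posterior_rain(prior, truth=Fraction(2, 3), yes_votes=3):
    """P(rain | every friend says yes), assuming the friends answer independently.

    `prior` is an ASSUMED prior probability of rain; the question does not give one.
    """
    p_yes_given_rain = truth ** yes_votes        # everyone tells the truth
    p_yes_given_dry = (1 - truth) ** yes_votes   # everyone lies
    numerator = prior * p_yes_given_rain
    return numerator / (numerator + (1 - prior) * p_yes_given_dry)


print(posterior_rain(Fraction(1, 2)))  # 8/9
print(posterior_rain(Fraction(1, 4)))  # 8/11
```

With a 50% prior the posterior is 8/9; a lower prior gives a lower posterior, which is why stating the prior is part of a good answer.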
Machine Learning
- Please explain the metrics you know. (e.g., RMSE, MAE, recall, precision)
- Why do we need normalization? What are the methods of normalization?
- Please explain Local Minima and Global Minimum.
- Please explain the curse of dimensionality.
- What are some common dimensionality reduction techniques?
- PCA is a dimensionality reduction technique, a data compression technique, and a noise removal technique. Can you explain why?
- Can you explain what the abbreviations such as LSA, LDA, SVD, etc. mean and how they are related to each other?
- What is the best way to explain Markov Chain to high school students?
- You need to extract topics from a pile of text. How will you approach it?
- Why does SVM, unlike dimensionality reduction methods, work by expanding the dimensionality? What makes SVM good?
- Defend the merits of an old technique, naive Bayes, over other good machine learning techniques.
- What is the appropriate metric for regression/classification?
- Please explain the Support, Confidence, and Lift of the Association Rule.
- Do you know about Newton's Method and Gradient Descent among optimization techniques?
- Do you have any thoughts on the differences between the machine learning approach and the statistics approach?
- What are the general problems with artificial neural networks (the traditional ones, before deep learning)?
- What do you think is the basis of the deep learning innovations that are emerging now?
- Can you explain the ROC curve?
- Suppose you have 100 servers. In that case, why would you use Random Forest rather than an artificial neural network?
- What are the main semantic shortcomings of K-means? (Apart from the large amount of calculations)
- Please explain L1 and L2 regularization.
- What is Cross Validation and how do I do it?
- Do you know XGBoost? Why is this model famous on Kaggle?
- What are the ensemble methods?
- What is a feature vector?
- What is the definition of a good model?
- Are 50 small decision trees better than a large decision tree? Why do you think so?
- Why is logistic regression often used in spam filters?
- What is the formula for OLS (ordinary least squares) regression?
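For the OLS question above, a minimal sketch may help. In matrix form the estimator is usually written β̂ = (XᵀX)⁻¹Xᵀy; for a single feature this reduces to slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄. The helper name `ols_fit` and the toy data are assumptions made for illustration.

```python
def ols_fit(xs, ys):
    """Closed-form OLS for one feature: returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x.
intercept, slope = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)  # 1.0 2.0
```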
Deep Learning
- What is deep learning? What is the difference between deep learning and machine learning?
- What are a cost function and an activation function?
- What are the features and differences between TensorFlow and PyTorch?
- What is Data Normalization and why is it needed?
- Please tell us about the activation functions you know. (Sigmoid, ReLU, LeakyReLU, Tanh, etc.)
- How should we deal with overfitting?
- What are hyperparameters?
- Please describe weight initialization methods. Which one do you use most often?
- What is a Boltzmann machine?
- What is your debugging know-how when using TF, PyTorch, etc.?
- What is the biggest drawback of neural nets? What is one-shot learning, which emerged to address it?
- These days, ReLU is used more than Sigmoid. Why?
- What does the word Non-Linearity mean and why is it necessary?
- How can a curved function be approximated with ReLU?
- What's wrong with ReLU?
- Why does bias exist?
- How would you explain Gradient Descent in simple terms?
- Why do you need to use Gradient? What are the horizontal and vertical axes in that graph? How would the graph be drawn in real life?
- Why do losses sometimes increase during GD?
- How would you explain Back Propagation in simple terms?
- Why does deep learning work well despite the local minima problem?
- How does GD avoid the local minima problem?
- How do I know whether the solution I found is the Global Minimum or not?
- Why separate training and test sets?
- Why is there a separate validation set?
- What does it mean to say that the test set is contaminated?
- What is Regularization?
- What is the effect of Batch Normalization?
- What is the effect of Dropout?
- What should you watch out for when actually using BN after training? How does that look in code?
- Can BN be applied to the generator side of GAN?
- How would you explain SGD, RMSprop, and Adam to the best of your knowledge?
- What does Stochastic mean in SGD?
- What are the pros and cons of making mini-batches small?
- Can you write down the formula for momentum?
- How many lines would it take to create a simple MNIST classifier in the MLP+CPU version using numpy?
- How many hours will it take to write something that works to some extent?
- How many lines is Back Propagation?
- How much will be added if we change to CNN?
- How many hours does it take to write a simple MNIST classifier in TF, PyTorch, etc.?
- Would it work well if I used MLP instead of CNN?
- Could you explain the last layer part?
- What if you want to train with BCE loss but monitor progress with MSE loss?
- Why is it good to use GPU when doing deep learning?
- I want to use both of my two GPUs. How do I do that?
- How do I calculate the GPU memory needed for training?
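For the gradient descent and momentum questions above, here is a small sketch of the classic (heavy-ball) momentum update, v ← μ·v − η·∇f(x), then x ← x + v. The symbol names (`mu` for the momentum coefficient, `lr` for the learning rate η), the default values, and the toy objective f(x) = (x − 3)² are assumptions chosen for illustration; other variants such as Nesterov momentum use a slightly different update.

```python
def minimize_with_momentum(grad, x0, lr=0.1, mu=0.9, steps=500):
    """Gradient descent with classic momentum on a scalar parameter."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(x)  # velocity: decayed history plus current step
        x = x + v                  # move the parameter along the velocity
    return x


# Toy objective f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = minimize_with_momentum(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # very close to 3
```

Setting `mu=0.0` recovers plain gradient descent, which makes the two easy to compare on the same objective.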
Python
- What is the difference between lists and tuples in Python?
- What are the key features of Python?
- What type of language is Python? Programming or scripting?
- Python is an interpreted language. Explain.
- What is PEP 8?
- How is memory managed in Python?
- What is a namespace in Python?
- What is PYTHONPATH?
- What are Python modules? Name some commonly used built-in modules in Python.
- What are local variables and global variables in Python?
- Is Python case sensitive?
- What is type conversion in Python?
- How do you install Python on Windows and set the PATH variable?
- Is indentation required in Python?
- What is the difference between Python Arrays and lists?
- What are functions in Python?
- What is `__init__`?
- What is a lambda function?
- What is self in Python?
- How do break, continue, and pass work?
- What does `[::-1]` do?
- How can you randomize the items of a list in place in Python?
- What's the difference between iterator and iterable?
- How can you generate random numbers in Python?
- What is the difference between range & xrange?
- How do you write comments in Python?
- What is pickling and unpickling?
- What are generators in Python?
- How will you capitalize the first letter of a string?
- How will you convert a string to all lowercase?
- How do you comment out multiple lines in Python?
- What are docstrings in Python?
- What is the purpose of is, not and in operators?
- What is the usage of help() and dir() function in Python?
- Whenever Python exits, why isn't all the memory de-allocated?
- What is a dictionary in Python?
- How can the ternary operator be used in Python?
- What does this mean: `*args`, `**kwargs`? And why would we use it?
- What does len() do?
- Explain split(), sub(), subn() methods of “re” module in Python.
- What are negative indexes and why are they used?
- What are Python packages?
- How can files be deleted in Python?
- What are the built-in types of Python?
- What advantages do NumPy arrays offer over (nested) Python lists?
- How do you add values to a Python array?
- How do you remove values from a Python array?
- Does Python have OOP concepts?
- What is the difference between deep and shallow copy?
- How is Multithreading achieved in Python?
- What is the process of compilation and linking in Python?
- What are Python libraries? Name a few of them.
- What is split used for?
- How do you import modules in Python?
- Explain Inheritance in Python with an example.
- How are classes created in Python?
- What is monkey patching in Python?
- Does Python support multiple inheritance?
- What is polymorphism in Python?
- Define encapsulation in Python.
- How do you do data abstraction in Python?
- Does Python make use of access specifiers?
- How do you create an empty class in Python?
- What does object() do?
- What is the map function in Python?
- Is NumPy better than Python lists?
- What is the GIL in Python?
- What makes CPython different from Python?
- What are Decorators in Python?
- What is object interning?
- What is @classmethod, @staticmethod, @property?
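Several of the Python questions above (slicing with `[::-1]`, shuffling a list in place, `*args` and `**kwargs`) are easiest to answer with a short demonstration. The following is only an illustrative sketch; the function name `describe` and the sample values are made up for this example.

```python
import random


def describe(*args, **kwargs):
    """Return how many positional arguments were given and the sorted keyword names."""
    # args is a tuple of positional arguments, kwargs a dict of keyword arguments.
    return len(args), sorted(kwargs)


items = [1, 2, 3, 4]

reversed_copy = items[::-1]  # a step of -1 builds a reversed copy
print(reversed_copy)         # [4, 3, 2, 1]
print(items)                 # [1, 2, 3, 4] (the original is unchanged)

random.shuffle(items)        # shuffles in place and returns None
print(sorted(items))         # [1, 2, 3, 4] (same elements, whatever the order)

print(describe(1, 2, key="v", other=0))  # (2, ['key', 'other'])
```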
Network
- Please explain each layer of TCP/IP.
- Please explain the difference between the OSI 7-layer model and the TCP/IP layer model.
- Please compare Frame, Packet, Segment, and Datagram.
- Please explain the difference between TCP and UDP.
- Please compare the headers of TCP and UDP.
- Please compare and explain TCP's 3-way handshake and 4-way handshake.
- Why does TCP's connection establishment take 3 steps while connection termination takes 4 steps?
- What happens if a packet transmitted before the server transmits the FIN flag arrives later than the FIN packet due to routing delay or retransmission due to packet loss?
- Why do you set the initial Sequence Number, ISN, by generating a random number instead of starting from 0?
- Please explain HTTP and HTTPS and explain the differences.
- Please explain the structure of HTTP request/response headers.
- Please compare HTTP and HTTPS operation processes.
- What is CORS?
- Please compare/explain the HTTP GET and POST methods.
- Please explain cookies and sessions.
- What is DNS?
- Please explain the concept of REST and RESTful and tell me the difference.
- What is a Socket? Please show a simple example of creating a socket in a language you are comfortable with.
- Please explain the difference between Socket.io and WebSocket.
- Please explain the difference between IPv4 and IPv6.
- What is a MAC address?
- Please explain the difference between a router, switch, and hub.
- What is SMTP?
- I accessed www.google.com with my laptop. Please explain in detail the process of sending and receiving a request.
- Please briefly introduce various network topologies.
- Please explain the subnet mask.
- What is data encapsulation?
- Please explain DHCP.
- Please explain some routing protocols. (e.g., link state, distance vector)
- What is Ethernet?
- Please explain the difference between client and server.
- Please explain the difference between delay, timing (jitter), and throughput.
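The socket question above asks for a simple example, so here is one hedged sketch in Python: a TCP echo over the loopback interface, with the server side running in a background thread. The helper name `echo_once`, the 5-second timeout, and the use of port 0 (letting the OS pick a free port) are choices made for this illustration, not part of the original question.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept a single connection and send back whatever it receives."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


# Server side: create, bind, and listen on a TCP socket.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.settimeout(5.0)          # avoid blocking forever if no client arrives
server.bind(("127.0.0.1", 0))   # port 0 asks the OS for any free port
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
worker.start()

# Client side: connect, send a message, and read the echoed reply.
with socket.create_connection(("127.0.0.1", port), timeout=5.0) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply)  # b'hello'
```

A real server would loop over `accept()` and handle partial reads; this sketch handles exactly one short message to keep the socket lifecycle visible.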
Operating System
- Please tell me the difference between process and thread (Process vs Thread).
- Please explain why one would use multi-threading instead of multi-processing.
- Please explain the locality of caches.
- Please explain thread safety. (hint: critical section)
- Please explain the difference between mutex and semaphore.
- Please explain what a scheduler is and the criteria for dividing it into short-term/mid-term/long-term.
- Please briefly explain the CPU schedulers FCFS, SJF, SRTF, Priority Scheduling, and RR.
- Please explain the difference between synchronous and asynchronous.
- Please briefly explain memory management strategies.
- Please explain virtual memory.
- Please explain the concept and conditions of deadlock.
- Please explain the difference between user level threads and kernel level threads.
- Please explain external fragmentation and internal fragmentation.
- Please explain what context switching is and describe the steps involved.
- Please explain Swapping.
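To make the thread-safety and mutex questions above concrete, here is a minimal sketch that protects a critical section with a lock. The names `SafeCounter` and `run_workers` are hypothetical, and Python's `threading.Lock` stands in for a mutex; the same idea applies to mutexes in other languages.

```python
import threading


class SafeCounter:
    """A counter whose increment is a critical section guarded by a mutex."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run this read-modify-write sequence.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        return self._value


def run_workers(num_threads, increments_each):
    """Increment one shared counter from several threads and return the total."""
    counter = SafeCounter()

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_workers(4, 10_000))  # 40000
```

Without the lock, the `+= 1` on shared state is a race condition and the final total is not guaranteed.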
Data Structure
- linked list
  - singly linked list
  - doubly linked list
  - circular linked list
- hash table
- stack
- queue
- graph
- tree
  - binary tree
  - full binary tree
  - complete binary tree
  - BST (binary search tree)
  - heap (binary heap)
  - red-black tree
  - B+ tree
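As a small illustration of the first topic in the list above, here is a sketch of a singly linked list with insertion at the head, in-place reversal, and iteration. The class and method names (`Node`, `SinglyLinkedList`, `push_front`, `reverse`) are assumptions for this example rather than a reference implementation.

```python
class Node:
    """One element of the list: a value plus a reference to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """Insert at the head in O(1)."""
        self.head = Node(value, self.head)

    def reverse(self):
        """Reverse the list in place in O(n) by flipping each next pointer."""
        prev, cur = None, self.head
        while cur is not None:
            nxt = cur.next
            cur.next = prev
            prev = cur
            cur = nxt
        self.head = prev

    def __iter__(self):
        cur = self.head
        while cur is not None:
            yield cur.value
            cur = cur.next


numbers = SinglyLinkedList()
for n in (1, 2, 3):
    numbers.push_front(n)
print(list(numbers))  # [3, 2, 1]
numbers.reverse()
print(list(numbers))  # [1, 2, 3]
```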
Algorithm
- Time and space complexity
- Sort Algorithm
  - Bubble Sort
  - Selection Sort
  - Insertion Sort
  - Merge Sort
  - Heap Sort
  - Quick Sort
  - Counting Sort
  - Radix Sort
- Divide and Conquer
- Dynamic Programming
- Greedy Algorithm
- Graph
  - Graph Traversal: BFS, DFS
  - Shortest Path
    - Dijkstra
    - Floyd-Warshall
    - Bellman-Ford
  - Minimum Spanning Tree
  - Union-find
  - Topological sort
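For the shortest-path topic in the list above, a compact sketch of Dijkstra's algorithm with a binary heap is shown below. It assumes non-negative edge weights and a graph given as an adjacency dict of `(neighbor, weight)` pairs; the function name `dijkstra` and the sample graph are illustrative only.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest distances from source; edge weights must be non-negative."""
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

With a binary heap this runs in O((V + E) log V); for graphs with negative edge weights, Bellman-Ford from the same list is the usual alternative.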
Contributors
References
- Datascience-Interview-Questions by zzsza
- awesome-interview-questions by DopplerHQ
- Interview_Question_for_Beginner by JaeYeopHan
- tech-interview by WeareSoft