You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
89 lines
3.5 KiB
TeX
89 lines
3.5 KiB
TeX
\section{An overview of the code}
|
|
The algorithm implement is multi-threaded and written in C\texttt{++}. To avoid redundances, we'll take in exame only the \emph{Actors Graph} case.
|
|
|
|
\subsection{Data structures}
|
|
In this case we are working with two simple \texttt{struct} for the classes \emph{Film} and \emph{Actor}
|
|
|
|
\lstinputlisting[language=c++]{code/struct.cpp}
|
|
\s
|
|
\nd Then we need two dictionaries build like this
|
|
|
|
\lstinputlisting[language=c++]{code/map.cpp}
|
|
\s
|
|
\nd We are considering the files \texttt{Attori.txt} and \texttt{FilmFiltrati.txt}, we don't need the relations one for now. Once that we have read this two files, we loop on each one brutally filling the two dictionaries created before. If a line is empty, we skip it. We are using a try and catch approach. Even if the good practice is to use it only for a specific error, since we are outputting everything on the terminal it makes sense to \emph{catch} any error.
|
|
|
|
\lstinputlisting[language=c++]{code/data.cpp}
|
|
\s
|
|
|
|
Now we can use the file \texttt{Relazioni.txt}. As before, we loop on all the elements of this file, creating the variables
|
|
|
|
\begin{itemize}
|
|
\item \texttt{id\textunderscore film}: index key of each movie
|
|
\item \texttt{id\textunderscore attore}: index key of each actor
|
|
\end{itemize}
|
|
|
|
\nd If they both exists, we update the list of indices of movies that the actor/actresses played in. In the same way, we update the list of indices of actors/actresses that played in the movie with that id.
|
|
|
|
\lstinputlisting[language=c++]{code/graph.cpp}
|
|
\s
|
|
Now that we have defined how to build this graph, we have to implement the algorithm what will return the top-k central elements. \s
|
|
|
|
\nd The code can be found here: \url{https://github.com/lukefleed/imdb-graph}
|
|
\s
|
|
\begin{center}
|
|
\qrcode{https://github.com/lukefleed/imdb-graph}
|
|
\end{center}
|
|
|
|
\subsection{Results - Actors Graph}
|
|
|
|
Here are the top-10 actors for closeness centrality obtained with the variable \texttt{MIN\textunderscore ACTORS=5} (as we'll see in the next section, it's the most accurate)
|
|
|
|
\begin{table}[h!]
|
|
\centering
|
|
\begin{tabular}{||c c||}
|
|
\hline
|
|
Node & Closeness centrality \\ [0.5ex]
|
|
\hline\hline
|
|
Eric Roberts & 0.324895 \\
|
|
Christopher Lee &0.319873 \\
|
|
Franco Nero & 0.31946 \\
|
|
John Savage & 0.316258 \\
|
|
Michael Madsen & 0.314451 \\
|
|
Udo Kier & 0.31357 \\
|
|
Geraldine Chaplin & 0.313141 \\
|
|
Malcolm McDowell & 0.313014 \\
|
|
David Carradine & 0.312648 \\
|
|
Christopher Plummer & 0.311859 \\ [1ex]
|
|
\hline
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
\nd All the other results are available in the Github repository for all the values of \texttt{MIN\textunderscore ACTORS} and for $k=100$
|
|
|
|
\newpage
|
|
\subsection{Results - Movies Graph}
|
|
|
|
Here are the top-10 movies for closeness centrality obtained with the variable \texttt{VOTES=500} (as we'll see in the next section, it's the most accurate)
|
|
|
|
\begin{table}[h!]
|
|
\centering
|
|
\begin{tabular}{||c c||}
|
|
\hline
|
|
Node & Closeness centrality \\ [0.5ex]
|
|
\hline\hline
|
|
Merlin & 0.290731 \\
|
|
The Odyssey & 0.290314 \\
|
|
The Color of Magic & 0.285208 \\
|
|
The Godfather Saga & 0.284932 \\
|
|
Jack and the Beanstalk: The Real Story & 0.283522 \\
|
|
In the Beginning & 0.28347 \\
|
|
RED 2 & 0.283362 \\
|
|
Lonesome Dove & 0.283353 \\
|
|
Moses & 0.282953 \\
|
|
Species & 0.282642 \\ [1ex]
|
|
\hline
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
\nd All the other results are available in the Github repository for all the values of \texttt{VOTES} and for $k=100$
|